tensorrt_llm_utils

Utils for TensorRT-LLM checkpoint export.

Some of the logic in this file is empirical and needs to be updated whenever exceptions occur.

Functions

convert_to_tensorrt_llm_config

Convert to TensorRT-LLM checkpoint config.

is_tensorrt_llm_0_8_or_9

Returns true if tensorrt_llm version is 0.8 or 0.9.

prepare_enc_dec_decoder_layer

Prepare the config for each decoder layer of encoder-decoder model.

prepare_enc_dec_export_dir

Prepare the export directory for encoder-decoder model.

weights_to_npz

Export the model_config and the weights in the backward-compatible npz format.

convert_to_tensorrt_llm_config(model_config, weight_keys=['lm_head'], tp_size_overwrite=None)

Convert to TensorRT-LLM checkpoint config.

Parameters:
  • model_config (ModelConfig) – The model_config to convert.

  • weight_keys (Iterable[str]) – An iterable of the names of the weights exported to the tensorrt_llm checkpoint.

  • tp_size_overwrite (int | None) – Overwrites the tp_size in config.mapping. Set this only for phi with TP, because the TRT-LLM builder expects its checkpoint to be unsharded.
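
The effect of tp_size_overwrite can be pictured with a small sketch. This is not the library's implementation: the helper name apply_tp_size_overwrite and the example config are hypothetical, and the only names taken from the description above are the "mapping" section and its "tp_size" entry. The sketch assumes the config is a plain nested dict.

```python
from copy import deepcopy
from typing import Any, Dict, Optional


def apply_tp_size_overwrite(
    config: Dict[str, Any], tp_size_overwrite: Optional[int] = None
) -> Dict[str, Any]:
    """Return a copy of config whose mapping.tp_size is replaced when an overwrite is given.

    Hypothetical helper for illustration only; not part of tensorrt_llm_utils.
    """
    result = deepcopy(config)  # leave the caller's dict untouched
    if tp_size_overwrite is not None:
        # Create the "mapping" section if it is missing, then set tp_size.
        result.setdefault("mapping", {})["tp_size"] = tp_size_overwrite
    return result


# Hypothetical config fragment; real checkpoint configs contain many more fields.
example_config = {"mapping": {"tp_size": 2}}
overwritten = apply_tp_size_overwrite(example_config, tp_size_overwrite=1)
unchanged = apply_tp_size_overwrite(example_config)
```

Leaving tp_size_overwrite as None keeps the tp_size already present in the config, which matches the description of the parameter as something to set only in the phi-with-TP case.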

is_tensorrt_llm_0_8_or_9()

Returns true if tensorrt_llm version is 0.8 or 0.9.
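
A check of this kind can be sketched as below. The documented function takes no arguments, so it presumably reads the installed tensorrt_llm version itself. The sketch instead takes the version string as an explicit argument, which is an assumption made to keep the example self-contained, and the name is_version_0_8_or_9 is hypothetical.

```python
import re


def is_version_0_8_or_9(version: str) -> bool:
    """Return True when the major.minor prefix of version is 0.8 or 0.9.

    Hypothetical stand-in that works on a version string instead of
    inspecting the installed tensorrt_llm package.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False  # not a recognizable version string
    major, minor = int(match.group(1)), int(match.group(2))
    return major == 0 and minor in (8, 9)
```

Comparing parsed integers rather than string prefixes avoids treating a version such as 0.80 as 0.8.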

prepare_enc_dec_decoder_layer(layer_config, model_config, enc_dec, layers)

Prepare the config for each decoder layer of encoder-decoder model.

Parameters:
  • layer_config –

  • model_config –

  • enc_dec –

  • layers –

prepare_enc_dec_export_dir(tensorrt_llm_config, export_root)

Prepare the export directory for encoder-decoder model.

Parameters:
  • tensorrt_llm_config (Dict[str, Any]) –

  • export_root (Path) –

weights_to_npz(weights, tensorrt_llm_config, export_dir)

Export the model_config and the weights in the backward-compatible npz format.

Parameters:
  • weights (Dict[str, ndarray]) –

  • tensorrt_llm_config (Dict[str, Any]) –

  • export_dir (Path) –
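
For readers unfamiliar with the npz format, the following is a minimal sketch of writing a weights dict and a config dict into an export directory. It is not the implementation of weights_to_npz: the helper name save_weights_npz and the file names config.json and weights.npz are assumptions chosen for illustration. The only library calls used are NumPy's savez and load and the standard json module.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

import numpy as np


def save_weights_npz(
    weights: Dict[str, np.ndarray], config: Dict[str, Any], export_dir: Path
) -> Path:
    """Write config as JSON and weights as one npz archive; return the npz path.

    Hypothetical helper; the file names below are assumptions, not the
    names used by tensorrt_llm_utils.
    """
    export_dir.mkdir(parents=True, exist_ok=True)
    with open(export_dir / "config.json", "w") as f:
        json.dump(config, f, indent=2)
    npz_path = export_dir / "weights.npz"
    # Each dict key becomes the name of one array inside the archive.
    np.savez(npz_path, **weights)
    return npz_path
```

The archive can be read back with np.load, which returns a mapping from the stored names to arrays:

```python
with tempfile.TemporaryDirectory() as tmp:
    path = save_weights_npz(
        {"lm_head.weight": np.ones((2, 3), dtype=np.float32)},
        {"architecture": "hypothetical"},
        Path(tmp) / "export",
    )
    with np.load(path) as loaded:
        restored = loaded["lm_head.weight"]
```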