tensorrt_llm_utils

Utils for TensorRT-LLM checkpoint export.

Some of the logic in this file is empirical and needs to be updated whenever exceptions occur.

Functions

`convert_to_tensorrt_llm_config`	Convert to TensorRT-LLM checkpoint config.
`is_tensorrt_llm_0_8_or_9`	Returns true if tensorrt_llm version is 0.8 or 0.9.
`prepare_enc_dec_export_dir`	Prepare the export directory for encoder-decoder model.
`prepare_t5_decoder_layer`	Prepare the config for each decoder layer of encoder-decoder model.

convert_to_tensorrt_llm_config(model_config, quant_config, hf_config=None)

Convert to TensorRT-LLM checkpoint config.

Parameters:

model_config (ModelConfig) – The model_config to convert.
quant_config (dict[str, Any]) – The quantization config to convert. It will be updated with kv_cache_quant_algo.
hf_config – The Hugging Face model config. If provided, TensorRT-LLM’s own export method is used when it is available.
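
The note that quant_config "will be updated" suggests the caller's dictionary is modified rather than copied. The sketch below only illustrates that in-place pattern. The helper name `attach_kv_cache_algo`, the top-level position of the `kv_cache_quant_algo` key, and the `"FP8"` values are assumptions made for illustration, not the library's actual implementation or schema.

```python
from typing import Any, Optional


def attach_kv_cache_algo(quant_config: dict[str, Any], kv_cache_algo: Optional[str]) -> None:
    """Hypothetical helper: record the KV cache algorithm on the caller's dict."""
    # Assumption: the key lives at the top level of the dict; the real layout may differ.
    quant_config["kv_cache_quant_algo"] = kv_cache_algo


# The caller's dictionary is mutated, so the new key is visible after the call.
quant_config: dict[str, Any] = {"quant_algo": "FP8"}
attach_kv_cache_algo(quant_config, "FP8")
```

If the original dictionary must stay unchanged, passing a copy (for example `dict(quant_config)`) is one way to avoid the side effect.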

is_tensorrt_llm_0_8_or_9(): Returns true if tensorrt_llm version is 0.8 or 0.9.

prepare_enc_dec_export_dir(tensorrt_llm_config, export_root)

Prepare the export directory for encoder-decoder model.

Parameters:

prepare_t5_decoder_layer(layer_config, model_config, enc_dec, layers)

Prepare the config for each decoder layer of encoder-decoder model.

Parameters: