tensorrt_llm_utils
Utils for TensorRT-LLM checkpoint export.
Some of the logic in this file is empirical and needs constant updating as exceptions occur.
Functions
| Function | Description |
| --- | --- |
| convert_to_tensorrt_llm_config | Convert to TensorRT-LLM checkpoint config. |
| is_tensorrt_llm_0_8_or_9 | Returns True if the installed tensorrt_llm version is 0.8 or 0.9. |
| prepare_enc_dec_decoder_layer | Prepare the config for each decoder layer of an encoder-decoder model. |
| prepare_enc_dec_export_dir | Prepare the export directory for an encoder-decoder model. |
- convert_to_tensorrt_llm_config(model_config, weight_keys, hf_config=None)
Convert to TensorRT-LLM checkpoint config.
- Parameters:
model_config (ModelConfig) – The model_config to convert.
weight_keys (Iterable[str]) – An iterable of the names of the weights exported to the tensorrt_llm checkpoint.
hf_config – The Hugging Face model config. If provided, TensorRT-LLM's own export method is used when available.
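
A minimal usage sketch for this function. The import path (modelopt.torch.export.tensorrt_llm_utils) and the pre-built objects in the wrapper (model_config, weights, hf_config) are assumptions for illustration, not part of this reference.

```python
# Hedged sketch; the import path and the caller-supplied objects are assumptions.
from modelopt.torch.export.tensorrt_llm_utils import convert_to_tensorrt_llm_config

def build_checkpoint_config(model_config, weights, hf_config=None):
    """Assumed wrapper: model_config is a ModelConfig from the export pipeline,
    weights is a dict mapping exported weight names to tensors."""
    return convert_to_tensorrt_llm_config(
        model_config,
        weight_keys=weights.keys(),  # names of the weights written to the checkpoint
        hf_config=hf_config,         # optional Hugging Face config of the source model
    )
```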
- is_tensorrt_llm_0_8_or_9()
Returns True if the installed tensorrt_llm version is 0.8 or 0.9.
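
A short version-gating sketch, assuming the same module path as above; the branch labels are placeholders and not part of the library.

```python
from modelopt.torch.export.tensorrt_llm_utils import is_tensorrt_llm_0_8_or_9

def checkpoint_layout() -> str:
    """Placeholder helper that picks a layout label based on the installed version."""
    if is_tensorrt_llm_0_8_or_9():
        return "legacy-0.8/0.9"  # behavior specific to tensorrt_llm 0.8 / 0.9
    return "default"             # any other installed version
```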
- prepare_enc_dec_decoder_layer(layer_config, model_config, enc_dec, layers)
Prepare the config for each decoder layer of an encoder-decoder model.
- Parameters:
layer_config (DecoderLayerConfig) –
model_config (T5Config) –
enc_dec (str) –
layers (list[DecoderLayerConfig]) –
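
A hedged sketch of driving this per-layer hook during a T5-style export. The import path, the "dec" selector value, and the surrounding layer_configs/t5_config objects are assumptions for illustration.

```python
from modelopt.torch.export.tensorrt_llm_utils import prepare_enc_dec_decoder_layer

def build_decoder_layers(layer_configs, t5_config):
    """Assumed wrapper: layer_configs is a list[DecoderLayerConfig],
    t5_config is the Hugging Face T5Config of the source model."""
    layers = []
    for layer_config in layer_configs:
        # "dec" marks the decoder side; the exact selector strings are an assumption.
        prepare_enc_dec_decoder_layer(layer_config, t5_config, "dec", layers)
    return layers
```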
- prepare_enc_dec_export_dir(tensorrt_llm_config, export_root)
Prepare the export directory for an encoder-decoder model.
- Parameters:
tensorrt_llm_config (dict[str, Any]) –
export_root (Path) –
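
A brief sketch of calling this last step, assuming the same module path; whether the call returns the resolved per-component directory is not confirmed here, so the result is passed through untouched.

```python
from pathlib import Path
from modelopt.torch.export.tensorrt_llm_utils import prepare_enc_dec_export_dir

def prepare_output_dir(tensorrt_llm_config: dict, export_root: Path):
    """Assumed wrapper: tensorrt_llm_config is the dict produced by
    convert_to_tensorrt_llm_config; export_root is the export destination."""
    return prepare_enc_dec_export_dir(tensorrt_llm_config, export_root)
```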