tensorrt_llm_utils
Utils for TensorRT-LLM checkpoint export.
Some of the logic in this file is empirical and needs to be updated whenever exceptions occur.
Functions
convert_to_tensorrt_llm_config – Convert to TensorRT-LLM checkpoint config.
is_tensorrt_llm_0_8_or_9 – Returns true if tensorrt_llm version is 0.8 or 0.9.
prepare_enc_dec_decoder_layer – Prepare the config for each decoder layer of encoder-decoder model.
prepare_enc_dec_export_dir – Prepare the export directory for encoder-decoder model.
weights_to_npz – Export the model_config and the weights in the backward-compatible npz format.
- convert_to_tensorrt_llm_config(model_config, weight_keys=['lm_head'], tp_size_overwrite=None)
Convert to TensorRT-LLM checkpoint config.
- Parameters:
model_config (ModelConfig) – The model_config to convert.
weight_keys (Iterable[str]) – An iterable of the names of the weights exported to the tensorrt_llm checkpoint.
tp_size_overwrite (int | None) – Overwrites the tp_size in config.mapping. Set this only for phi with TP, because the TRT-LLM builder expects its checkpoint to be unsharded.
- is_tensorrt_llm_0_8_or_9()
Returns true if tensorrt_llm version is 0.8 or 0.9.
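A version gate of this kind can be sketched as follows. This is a minimal illustration and not the library's implementation: the helper name `is_version_0_8_or_0_9` and its string-parsing approach are assumptions. It takes the version string as an argument so it can be exercised without tensorrt_llm installed.

```python
import re


def is_version_0_8_or_0_9(version: str) -> bool:
    """Hypothetical helper: True when `version` starts with 0.8 or 0.9.

    Only the leading "major.minor" pair is inspected, so suffixes such as
    ".0" or ".0.dev2024040900" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        # Not a recognizable version string.
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return major == 0 and minor in (8, 9)
```

In practice the version string would come from the installed package, for example via `importlib.metadata.version`. Comparing parsed integers rather than string prefixes keeps a version such as 0.80 from being mistaken for 0.8.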
- prepare_enc_dec_decoder_layer(layer_config, model_config, enc_dec, layers)
Prepare the config for each decoder layer of encoder-decoder model.
- Parameters:
layer_config (DecoderLayerConfig) –
model_config (T5Config) –
enc_dec (str) –
layers (List[DecoderLayerConfig]) –
- prepare_enc_dec_export_dir(tensorrt_llm_config, export_root)
Prepare the export directory for encoder-decoder model.
- Parameters:
tensorrt_llm_config (Dict[str, Any]) –
export_root (Path) –
- weights_to_npz(weights, tensorrt_llm_config, export_dir)
Export the model_config and the weights in the backward-compatible npz format.
- Parameters:
weights (Dict[str, ndarray]) –
tensorrt_llm_config (Dict[str, Any]) –
export_dir (Path) –
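To illustrate what an npz-based export of this shape might look like, the sketch below writes a config dictionary and a weight dictionary into an export directory using NumPy. It is not the actual `weights_to_npz` implementation: the function name `save_weights_npz_sketch` and the file names `config.json` and `weights.npz` are assumptions chosen for the example.

```python
import json
from pathlib import Path
from typing import Any, Dict

import numpy as np


def save_weights_npz_sketch(
    weights: Dict[str, np.ndarray],
    config: Dict[str, Any],
    export_dir: Path,
) -> None:
    """Hypothetical sketch: store a config and weights under export_dir."""
    export_dir.mkdir(parents=True, exist_ok=True)

    # Assumed file name; the real exporter may use a different layout.
    with open(export_dir / "config.json", "w") as f:
        json.dump(config, f, indent=2)

    # np.savez stores each keyword argument as one named array in the archive.
    np.savez(export_dir / "weights.npz", **weights)
```

The archive can be read back with `np.load`, which returns a mapping from the original weight names to arrays.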