Python Export Pipeline (Deprecated)
The historical FX-tracing export package remains available in 0.7.1 for
compatibility and for components that are not yet fully covered by
llm_loader. For new enablement, use:
- experimental/quantization to create unified quantized HuggingFace-style checkpoints when starting from FP16/BF16 source checkpoints.
- experimental/llm_loader to export ONNX and runtime sidecars from FP16/BF16, pre-quantized, or locally quantized checkpoints.
- llm_build, visual_build, audio_build, action_build, and the runtime examples to build engines and run inference.
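The ordering of these stages can be sketched as a small planning helper. This is only an illustration of the workflow described above: `CheckpointKind`, `plan_stages`, and the `builder` parameter are hypothetical names, not part of the package, and the code does not invoke any of the tools. The stage strings are simply the directory and tool names listed above.

```python
from enum import Enum
from typing import List

# Build tools named in the text above.
BUILDERS = {"llm_build", "visual_build", "audio_build", "action_build"}


class CheckpointKind(Enum):
    """Hypothetical labels for the checkpoint types mentioned above."""
    FP16_BF16 = "fp16_bf16"          # unquantized source checkpoint
    PRE_QUANTIZED = "pre_quantized"  # quantized before entering this workflow


def plan_stages(kind: CheckpointKind,
                quantize_locally: bool = False,
                builder: str = "llm_build") -> List[str]:
    """Return the ordered stages for one checkpoint (illustrative only)."""
    if builder not in BUILDERS:
        raise ValueError(f"unknown builder: {builder}")

    stages = []
    if quantize_locally:
        # Local quantization starts from an FP16/BF16 source checkpoint.
        if kind is not CheckpointKind.FP16_BF16:
            raise ValueError("local quantization expects an FP16/BF16 checkpoint")
        stages.append("experimental/quantization")

    # Export accepts FP16/BF16, pre-quantized, or locally quantized checkpoints.
    stages.append("experimental/llm_loader")
    # Engine build and inference come last.
    stages.append(builder)
    return stages


if __name__ == "__main__":
    print(plan_stages(CheckpointKind.FP16_BF16, quantize_locally=True))
    print(plan_stages(CheckpointKind.PRE_QUANTIZED, builder="visual_build"))
```

The point of the sketch is that quantization is an optional first stage, while export through experimental/llm_loader and the engine build apply to every checkpoint type.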
The tensorrt_edgellm/ folder will be removed in 0.8.0 after the
experimental/quantization -> experimental/llm_loader workflow reaches full
feature parity for all models and features.
Replacement Map
| Legacy responsibility | Current workflow |
|---|---|
| ModelOpt quantization inside the export package | |
| LLM ONNX export | |
| VLM/audio/Omni/VLA component export | |
| EAGLE base export | |
| MTP export | |
| LoRA graph insertion and adapter processing | |
| Static LoRA merge | |
| FP8 KV cache | |
| FP8 embedding sidecar | |
| Vocabulary reduction | |