export_utils
Utilities for exporting LLM models to ONNX.
Classes

- ModelLoader – A class to handle HuggingFace model loading and configuration.
- RopeType – Rope type enum.
- WrapperModelForCausalLM – Wrapper Model to ensure all models have the same I/O.

Functions

- llm_to_onnx – Export the WrapperModelForCausalLM to ONNX with fixed I/O names and shape definitions, and save the result to output_dir.
- torch_to_onnx – Export the model to ONNX.
- class ModelLoader
Bases: object
A class to handle HuggingFace model loading and configuration.
- __init__(torch_dir, config_path)
Initialize the ModelLoader.
- get_model_type()
Get model type from config file.
- get_rope_type()
Get rope type.
- load_model()
Load HuggingFace model based on model type.
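A minimal usage sketch, assuming the module is importable as export_utils and using hypothetical local paths:

```python
from export_utils import ModelLoader

# Hypothetical checkpoint directory and config path.
loader = ModelLoader(torch_dir="./llama-7b", config_path="./llama-7b/config.json")

model_type = loader.get_model_type()  # e.g. "llama", read from the config file
rope_type = loader.get_rope_type()    # a RopeType enum member
model = loader.load_model()           # HuggingFace model selected by model_type
```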
- class RopeType
Bases: Enum
Rope type enum.
- K_MROPE = 3
- K_NONE = 0
- K_ROPE_ROTATE_GPTJ = 1
- K_ROPE_ROTATE_NEOX = 2
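The members map rope variants to integer codes; for illustration, using only the values documented above:

```python
from export_utils import RopeType

# Integer codes taken from the enum members documented above.
assert RopeType.K_NONE.value == 0
assert RopeType.K_ROPE_ROTATE_GPTJ.value == 1
assert RopeType.K_ROPE_ROTATE_NEOX.value == 2
assert RopeType.K_MROPE.value == 3
```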
- class WrapperModelForCausalLM
Bases: Module
Wrapper Model to ensure all models have the same I/O.
- __init__(model)
Initialize the WrapperModelForCausalLM.
- forward(input_ids, past_key_values)
Forward pass.
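A sketch of wrapping a loaded model and running a single forward step; the shapes and the KV-cache value are illustrative assumptions, not prescribed by this module:

```python
import torch
from export_utils import ModelLoader, WrapperModelForCausalLM

loader = ModelLoader("./llama-7b", "./llama-7b/config.json")  # hypothetical paths
wrapper = WrapperModelForCausalLM(loader.load_model())

# Dummy decode-step inputs; past_key_values must use whatever KV-cache
# format the underlying HuggingFace model expects.
input_ids = torch.ones((1, 1), dtype=torch.int64)
past_key_values = None
outputs = wrapper(input_ids, past_key_values)
```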
- llm_to_onnx(model, output_dir, extra_inputs={}, extra_dyn_axes={})
Export the WrapperModelForCausalLM to ONNX with fixed I/O names and shape definitions, and save the result to output_dir.
- Parameters:
model – torch.nn.Module, typically the WrapperModelForCausalLM to export.
output_dir – str, the directory where the exported ONNX model is saved.
extra_inputs – dict, additional inputs appended after the KV cache. Usually for VL models.
extra_dyn_axes – dict, additional dynamic-axis definitions. Usually for VL models.
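An end-to-end sketch of a text-only export, assuming the module is importable as export_utils; the commented VL call shows how extra_inputs and extra_dyn_axes might be populated (the key names and tensors are assumptions):

```python
from export_utils import ModelLoader, WrapperModelForCausalLM, llm_to_onnx

loader = ModelLoader("./llama-7b", "./llama-7b/config.json")  # hypothetical paths
wrapper = WrapperModelForCausalLM(loader.load_model())

# Text-only model: no inputs beyond input_ids and the KV cache.
llm_to_onnx(wrapper, output_dir="./onnx_out")

# A VL model would append its extra tensors and their dynamic axes, e.g.:
# llm_to_onnx(wrapper, "./onnx_out",
#             extra_inputs={"pixel_values": dummy_pixels},
#             extra_dyn_axes={"pixel_values": {0: "batch"}})
```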
- torch_to_onnx(model, inputs, onnx_dir, onnx_name, input_names, output_names, dynamic_axes)
Export the model to ONNX.
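A small sketch of this lower-level helper with a toy module; all names, shapes, and the output location are illustrative assumptions:

```python
import torch
from export_utils import torch_to_onnx

model = torch.nn.Linear(16, 4)   # toy module standing in for a real one
inputs = (torch.randn(1, 16),)   # example inputs used to trace the graph

torch_to_onnx(
    model,
    inputs,
    onnx_dir="./onnx_out",
    onnx_name="linear.onnx",
    input_names=["x"],
    output_names=["y"],
    dynamic_axes={"x": {0: "batch"}, "y": {0: "batch"}},
)
```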