export_utils

Utilities for exporting LLMs to ONNX.

Classes

ModelLoader

A class to handle HuggingFace model loading and configuration.

RopeType

RoPE type enum.

WrapperModelForCausalLM

Wrapper Model to ensure all models have the same I/O.

Functions

llm_to_onnx

Export the WrapperModelForCausalLM to ONNX with fixed I/O names and shape definitions, and save the result to output_dir.

torch_to_onnx

Export the model to ONNX.

class ModelLoader

Bases: object

A class to handle HuggingFace model loading and configuration.

__init__(torch_dir, config_path)

Initialize the ModelLoader.

get_model_type()

Get model type from config file.

get_rope_type()

Get RoPE type.

load_model()

Load HuggingFace model based on model type.
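
A minimal usage sketch, assuming torch_dir points at a HuggingFace checkpoint directory and config_path at the export configuration file (paths are placeholders):

    # Hypothetical usage of ModelLoader; paths are placeholders.
    from export_utils import ModelLoader

    loader = ModelLoader(
        torch_dir="/path/to/hf_checkpoint",  # directory with the HuggingFace weights
        config_path="/path/to/config.json",  # configuration file read by get_model_type()
    )

    model_type = loader.get_model_type()     # model type read from the config file
    rope_type = loader.get_rope_type()       # a RopeType enum value
    model = loader.load_model()              # HuggingFace model chosen by model type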

class RopeType

Bases: Enum

RoPE type enum.

K_NONE = 0
K_ROPE_ROTATE_GPTJ = 1
K_ROPE_ROTATE_NEOX = 2
K_MROPE = 3
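
For reference, K_ROPE_ROTATE_GPTJ and K_ROPE_ROTATE_NEOX name the two common rotary-embedding layouts (interleaved pairs vs. rotated halves), K_NONE means no rotary embedding, and K_MROPE presumably refers to the multimodal RoPE variant used by some VL models. A minimal sketch of the two rotation helpers in their usual form, given as general RoPE background rather than this module's implementation:

    import torch

    def rotate_half_neox(x):
        # NeoX-style (K_ROPE_ROTATE_NEOX): split the head dimension into two
        # halves and rotate them against each other.
        x1, x2 = x.chunk(2, dim=-1)
        return torch.cat((-x2, x1), dim=-1)

    def rotate_every_two_gptj(x):
        # GPT-J-style (K_ROPE_ROTATE_GPTJ): rotate each adjacent (even, odd)
        # channel pair, keeping the interleaved layout.
        x1 = x[..., ::2]
        x2 = x[..., 1::2]
        return torch.stack((-x2, x1), dim=-1).flatten(-2)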

class WrapperModelForCausalLM

Bases: Module

Wrapper Model to ensure all models have the same I/O.

__init__(model)

Initialize the WrapperModelForCausalLM.

forward(input_ids, past_key_values)

Forward pass.
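
A minimal sketch of wrapping a loaded model, assuming forward takes token IDs plus the past KV cache; the cache format and the structure of the returned outputs are not documented here, so they are left as assumptions:

    import torch
    from export_utils import ModelLoader, WrapperModelForCausalLM

    # Hypothetical: wrap a loaded HF model so its I/O matches the export contract.
    loader = ModelLoader("/path/to/hf_checkpoint", "/path/to/config.json")
    wrapped = WrapperModelForCausalLM(loader.load_model()).eval()

    input_ids = torch.tensor([[1, 2, 3]])  # (batch, seq_len) token IDs
    past_key_values = None                 # cache layout is an assumption, not documented

    with torch.no_grad():
        outputs = wrapped(input_ids, past_key_values)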

llm_to_onnx(model, output_dir, extra_inputs={}, extra_dyn_axes={})

Export the WrapperModelForCausalLM to ONNX with fixed I/O names and shape definitions, and save the result to output_dir.

Parameters:
  • model – torch.nn.Module, the WrapperModelForCausalLM to export.

  • output_dir – str, the output directory where the original ONNX model is saved.

  • extra_inputs – dict, additional inputs appended after kv_cache. Usually used for VL models.

  • extra_dyn_axes – dict, dynamic axes for the extra inputs. Usually used for VL models.
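
A minimal export sketch for a text-only model, leaving extra_inputs and extra_dyn_axes at their defaults; the commented VL variant uses hypothetical input names, not names defined by this module:

    # Hypothetical end-to-end export; paths are placeholders.
    from export_utils import ModelLoader, WrapperModelForCausalLM, llm_to_onnx

    loader = ModelLoader("/path/to/hf_checkpoint", "/path/to/config.json")
    wrapped = WrapperModelForCausalLM(loader.load_model())

    llm_to_onnx(wrapped, output_dir="/path/to/onnx_out")

    # For a VL model, extra inputs and their dynamic axes could be appended
    # after kv_cache (the names below are illustrative only):
    # llm_to_onnx(
    #     wrapped,
    #     output_dir="/path/to/onnx_out",
    #     extra_inputs={"pixel_values": pixel_values},
    #     extra_dyn_axes={"pixel_values": {0: "num_patches"}},
    # )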

torch_to_onnx(model, inputs, onnx_dir, onnx_name, input_names, output_names, dynamic_axes)

Export the model to ONNX.
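
The parameter names mirror those of torch.onnx.export. A minimal sketch on a toy module, assuming inputs is the example-input tuple traced by the exporter and onnx_dir/onnx_name are joined to form the output path:

    import torch
    from export_utils import torch_to_onnx

    # Hypothetical call on a toy module; names and shapes are placeholders.
    model = torch.nn.Linear(16, 4)
    dummy_input = torch.randn(1, 16)

    torch_to_onnx(
        model,
        inputs=(dummy_input,),
        onnx_dir="/path/to/onnx_out",
        onnx_name="toy.onnx",
        input_names=["x"],
        output_names=["y"],
        dynamic_axes={"x": {0: "batch"}, "y": {0: "batch"}},
    )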