modelopt.onnx.autocast.convert
- convert(onnx_path, low_precision_type='fp16', nodes_to_exclude=None, op_types_to_exclude=None, data_max=512, init_max=np.float16(65500.0), keep_io_types=False, calibration_data=None, custom_rule=None, init_conversion_max_bytes=1048576)
Convert an ONNX model to mixed precision.
- Parameters:
onnx_path (str) – Path to the input ONNX model.
low_precision_type (str) – Target precision to reduce to (‘fp16’ or ‘bf16’).
nodes_to_exclude (list[str] | None) – List of regex patterns to match node names that should remain in FP32.
op_types_to_exclude (list[str] | None) – List of operation types that should remain in FP32.
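data_max (float) – Maximum absolute value for node input and output values. Nodes whose values exceed this threshold remain in FP32.
init_max (float) – Maximum absolute value for initializers. Nodes whose initializers exceed this threshold remain in FP32.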
keep_io_types (bool) – Whether to preserve the model's original input and output data types.
calibration_data (str | None) – Path to an input data file used by the reference runner.
custom_rule (NodeRuleBase | None) – Optional custom rule for node classification (inherits from NodeRuleBase).
init_conversion_max_bytes (int) – Maximum size in bytes for initializer conversion. Larger initializers will be cast at runtime.
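The default thresholds are easier to read next to the FP16 range itself. The following NumPy sketch is independent of ModelOpt and only illustrates the numeric limits; it makes no claim about how convert applies data_max or init_max internally.

```python
import numpy as np

# Largest finite value representable in IEEE half precision.
fp16_max = float(np.finfo(np.float16).max)

# The documented default init_max, np.float16(65500.0), rounds to the nearest
# representable FP16 value, because FP16 values in this range are spaced 32 apart.
default_init_max = float(np.float16(65500.0))

# A float32 value above the FP16 range overflows to infinity when cast down.
with np.errstate(over="ignore"):
    overflowed = np.float32(70000.0).astype(np.float16)

print(fp16_max, default_init_max, overflowed)
```

Both fp16_max and default_init_max evaluate to 65504.0, so the default init_max is effectively the FP16 maximum, while the default data_max of 512 is far below it.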
- Returns:
The converted mixed precision model.
- Return type:
onnx.ModelProto
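A minimal usage sketch based on the signature above. The import path follows the module name shown on this page; the helper name convert_and_save, the node-name pattern, and the excluded op type are hypothetical values chosen for illustration, not recommendations.

```python
# Keyword arguments taken from the signature documented above.
# The node-name pattern and op type are hypothetical examples.
convert_kwargs = {
    "low_precision_type": "fp16",
    "nodes_to_exclude": [r".*LayerNorm.*"],
    "op_types_to_exclude": ["Softmax"],
    "data_max": 512,
    "keep_io_types": True,
}


def convert_and_save(onnx_path, output_path, **overrides):
    """Convert the model at onnx_path and write the result to output_path."""
    # Imported lazily so this sketch can be loaded without ModelOpt installed.
    import onnx
    from modelopt.onnx.autocast.convert import convert

    # Per-call overrides take precedence over the defaults defined above.
    model = convert(onnx_path, **{**convert_kwargs, **overrides})
    onnx.save(model, output_path)  # convert returns an onnx.ModelProto
    return model
```

A call such as convert_and_save("model.onnx", "model_fp16.onnx") would then write the mixed precision model to disk; both file names are placeholders.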