precisionconverter

Precision conversion module for ONNX models.

This module provides functionality for converting ONNX models between different floating point precisions, specifically between FP32 and lower precisions such as FP16 or BF16. It inserts cast operations, converts initializers, and ensures model validity through type checking and cleanup of redundant operations.
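To make the effect of lowering precision concrete, here is a minimal standard-library sketch of what happens to a single value when it is rounded to FP16. It is not part of this module; `round_to_fp16` is a hypothetical helper introduced only for illustration, and it says nothing about how the converter itself casts initializers.

```python
import struct


def round_to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Hypothetical helper for illustration; not part of the library.
    """
    # The "e" struct format is binary16. Packing and unpacking drops the
    # bits that FP16 cannot represent.
    return struct.unpack("<e", struct.pack("<e", value))[0]


for x in (1.0, 0.1, 1024.5):
    # Values that are not exactly representable in FP16 come back changed.
    print(x, "->", round_to_fp16(x))
```

Rounding error of this kind is one reason to keep numerically sensitive nodes in FP32 while converting the rest.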

Classes

`PrecisionConverter`	Precision conversion module for ONNX models.
`PrecisionTypes`	PrecisionTypes(onnx_type, numpy_type, str_short, str_full)

class PrecisionConverter

Bases: object

Precision conversion module for ONNX models.

This class converts ONNX models between different floating point precisions, specifically between FP32 and lower precisions such as FP16 or BF16. It inserts cast operations, converts initializers, and ensures model validity.

Public Methods:

convert – Convert specified nodes to FP16/BF16 precision while keeping others in FP32.

__init__(model, value_info_map, initializer_map, node_to_init_map, keep_io_types=False, low_precision_type='fp16', init_conversion_max_bytes=None, custom_ops=None)

Initialize PrecisionConverter.

Parameters:

model (ModelProto) – ONNX model to convert.
value_info_map (dict[str, ValueInfoProto]) – Map of tensor names to value info.
initializer_map (dict[str, TensorProto]) – Map of tensor names to initializers.
node_to_init_map (dict[str, list[str]]) – Map of node names to lists of initializer names.
keep_io_types (bool) – Keep the input and output types of the model. If False (the default), they will be converted.
low_precision_type (str) – Low precision type to convert to. Defaults to 'fp16'.
init_conversion_max_bytes (int | None) – Maximum size in bytes for initializer conversion. Larger initializers will be cast at runtime.
custom_ops (set[str] | None) – Set of custom ops.

Return type:

None
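The three map parameters can be derived from the model's graph. The following is a minimal sketch, not the library's own code: `build_maps` is a hypothetical helper, and two details are assumptions, namely that graph inputs and outputs belong in `value_info_map`, and that a node's initializers are the entries of its inputs that name an initializer. The helper only reads `graph.value_info`, `graph.input`, `graph.output`, `graph.initializer` and `graph.node`, so it is shown on a small stand-in object instead of a real `ModelProto`.

```python
from types import SimpleNamespace


def build_maps(model):
    """Build (value_info_map, initializer_map, node_to_init_map).

    Hypothetical helper; works on any object shaped like onnx.ModelProto.
    """
    graph = model.graph
    # Tensor name -> value info. Including graph inputs and outputs is an
    # assumption about what the converter needs to look up.
    value_info_map = {
        vi.name: vi
        for vi in [*graph.value_info, *graph.input, *graph.output]
    }
    # Tensor name -> initializer.
    initializer_map = {init.name: init for init in graph.initializer}
    # Node name -> names of the initializers the node consumes (assumed meaning).
    node_to_init_map = {
        node.name: [name for name in node.input if name in initializer_map]
        for node in graph.node
    }
    return value_info_map, initializer_map, node_to_init_map


# Stand-in graph: x -> MatMul(w) -> h -> Relu -> y
toy = SimpleNamespace(graph=SimpleNamespace(
    value_info=[SimpleNamespace(name="h")],
    input=[SimpleNamespace(name="x")],
    output=[SimpleNamespace(name="y")],
    initializer=[SimpleNamespace(name="w")],
    node=[
        SimpleNamespace(name="matmul_0", input=["x", "w"], output=["h"]),
        SimpleNamespace(name="relu_0", input=["h"], output=["y"]),
    ],
))

value_info_map, initializer_map, node_to_init_map = build_maps(toy)
print(node_to_init_map)
```

With a real model, the same three values would be passed to `PrecisionConverter(model, value_info_map, initializer_map, node_to_init_map, ...)`.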

convert(high_precision_nodes, low_precision_nodes)

Convert model to mixed precision.

Parameters:

high_precision_nodes (list[str]) – List of node names to keep in high precision.
low_precision_nodes (list[str]) – List of node names to convert to low precision.

Returns:

The converted mixed precision model.

Return type:

onnx.ModelProto
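Since `convert` takes two lists of node names, the caller has to decide up front which nodes stay in high precision. A minimal sketch of preparing those lists follows. `split_nodes` and the node names are hypothetical, and the rule that every node appears in exactly one list is an assumption of this sketch, not something the documentation above states.

```python
def split_nodes(all_node_names, keep_high):
    """Partition node names into (high_precision_nodes, low_precision_nodes).

    Hypothetical helper; assumes each node belongs to exactly one list.
    """
    keep_high = set(keep_high)
    unknown = keep_high - set(all_node_names)
    if unknown:
        # Catch typos early instead of silently ignoring a requested node.
        raise ValueError(f"Unknown node names: {sorted(unknown)}")
    # Preserve the original node order in both lists.
    high = [name for name in all_node_names if name in keep_high]
    low = [name for name in all_node_names if name not in keep_high]
    return high, low


high_precision_nodes, low_precision_nodes = split_nodes(
    ["conv_0", "softmax_0", "matmul_0"], keep_high=["softmax_0"]
)
print(high_precision_nodes, low_precision_nodes)

# With a constructed converter, the lists would then be passed as:
#   converted_model = converter.convert(high_precision_nodes, low_precision_nodes)
```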

class PrecisionTypes

Bases: tuple

PrecisionTypes(onnx_type, numpy_type, str_short, str_full)

static __new__(_cls, onnx_type, numpy_type, str_short, str_full): Create new instance of PrecisionTypes(onnx_type, numpy_type, str_short, str_full)

numpy_type: Alias for field number 1

onnx_type: Alias for field number 0

str_full: Alias for field number 3

str_short: Alias for field number 2