precisionconverter

Precision conversion module for ONNX models.

This module converts ONNX models between floating-point precisions, specifically between FP32 and lower precisions such as FP16 or BF16. It inserts cast operations, converts initializers, and ensures model validity through type checking and cleanup of redundant operations.
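
As a rough illustration of why a conversion to FP16 may need some values or nodes to stay in FP32, the sketch below rounds a few Python floats to IEEE 754 half precision using only the standard library. It is not how this module converts initializers; the helper name `to_fp16_and_back` is made up for the example.

```python
import struct


def to_fp16_and_back(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and widen it again."""
    # The 'e' format code is the standard library's binary16 (FP16) encoding.
    return struct.unpack("<e", struct.pack("<e", value))[0]


# 0.1 has no exact FP16 representation, so the round trip changes it slightly.
rounded = to_fp16_and_back(0.1)

# 1e-8 is below the smallest FP16 subnormal, so it is flushed to zero.
underflowed = to_fp16_and_back(1e-8)

# 70000.0 is above the largest finite FP16 value (65504); struct refuses to pack it.
try:
    to_fp16_and_back(70000.0)
    overflowed = False
except OverflowError:
    overflowed = True
```

Precision loss, underflow, and overflow of this kind are the usual reasons for keeping selected nodes in high precision during a mixed-precision conversion.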

Classes

InitializerConsumerTracker

A class that tracks the nodes that consume an initializer.

InputIndexTracker

A class that tracks the index of an input to a node.

PrecisionConverter

Converts selected nodes of an ONNX model to a lower floating-point precision.

PrecisionTypes

Named tuple: PrecisionTypes(onnx_type, numpy_type, str_short, str_full)

class InitializerConsumerTracker

Bases: object

A class that tracks the nodes that consume an initializer.

__init__(low_precision_nodes=<factory>, high_precision_nodes=<factory>)
Parameters:
  • low_precision_nodes (list[InputIndexTracker])

  • high_precision_nodes (list[InputIndexTracker])

Return type:

None

high_precision_nodes: list[InputIndexTracker]
low_precision_nodes: list[InputIndexTracker]
class InputIndexTracker

Bases: object

A class that tracks the index of an input to a node.

__init__(node, node_index)
Parameters:
  • node (NodeProto)

  • node_index (int)

Return type:

None

node: NodeProto
node_index: int
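
The two tracker classes can be pictured as small record types. The sketch below re-declares them as plain dataclasses for illustration only: that the real classes are dataclasses is an assumption inferred from the `<factory>` defaults, the reading of `node_index` as a position in the node's input list is also an assumption, and the node is a `SimpleNamespace` stand-in rather than a real `NodeProto`.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any


@dataclass
class InputIndexTracker:
    node: Any        # a NodeProto in the real module; any object works in this sketch
    node_index: int  # assumed meaning: position of the tracked input in node.input


@dataclass
class InitializerConsumerTracker:
    # default_factory gives every tracker its own empty list, which is what the
    # "<factory>" defaults in the documented signature suggest.
    low_precision_nodes: list[InputIndexTracker] = field(default_factory=list)
    high_precision_nodes: list[InputIndexTracker] = field(default_factory=list)


# Stand-in node that consumes the initializer "scale" as its second input.
mul = SimpleNamespace(name="mul_0", input=["x", "scale"])

tracker_a = InitializerConsumerTracker()
tracker_b = InitializerConsumerTracker()
tracker_a.low_precision_nodes.append(
    InputIndexTracker(node=mul, node_index=mul.input.index("scale"))
)
```

Because each instance owns its lists, recording a consumer on `tracker_a` leaves `tracker_b` empty.
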
class PrecisionConverter

Bases: object

Precision conversion module for ONNX models.

The converter changes selected nodes of an ONNX model from FP32 to a lower precision such as FP16 or BF16. It inserts cast operations, converts initializers, and ensures model validity.

Public Methods:

convert: Convert specified nodes to FP16/BF16 precision while keeping others in FP32.

__init__(model, value_info_map, initializer_map, node_to_init_map, keep_io_types=False, low_precision_type='fp16', init_conversion_max_bytes=None, custom_ops=None, min_opset=13, max_ir_version=None, trt_plugins=[])

Initialize PrecisionConverter.

Parameters:
  • model (ModelProto) – ONNX model to convert.

  • value_info_map (dict[str, ValueInfoProto]) – Map of tensor names to value info.

  • initializer_map (dict[str, TensorProto]) – Map of tensor names to initializers.

  • node_to_init_map (dict[str, list[TensorProto]]) – Map of node names to the lists of initializers they consume.

  • keep_io_types (bool) – If True, keep the model's input and output types; otherwise they are converted.

  • low_precision_type (str) – Low precision to convert to (FP16 or BF16). Defaults to 'fp16'.

  • init_conversion_max_bytes (int | None) – Maximum size in bytes for initializer conversion. Larger initializers will be cast at runtime.

  • custom_ops (set[str] | None) – Set of custom ops.

  • min_opset (int)

  • max_ir_version (int | None)

  • trt_plugins (list[str] | None)

Return type:

None
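
The three map arguments can be pictured as name-keyed dictionaries derived from the model graph. The sketch below is one plausible way to build them and is not taken from the library: the helper name `build_maps` is hypothetical, including graph inputs and outputs in `value_info_map` is an assumption, and `node_to_init_map` follows the annotated type `list[TensorProto]`. It only relies on attribute names (`graph.node`, `graph.initializer`, and so on) that a `ModelProto` also has, so stand-in objects are used to keep it self-contained.

```python
from types import SimpleNamespace


def build_maps(model):
    """Build name-keyed lookup tables from a model-like object (hypothetical helper)."""
    graph = model.graph
    # Assumption: type information for graph inputs and outputs is included too.
    value_info_map = {
        vi.name: vi for vi in [*graph.input, *graph.output, *graph.value_info]
    }
    initializer_map = {init.name: init for init in graph.initializer}
    # For each node, collect the initializers that appear among its inputs.
    node_to_init_map = {
        node.name: [initializer_map[name] for name in node.input if name in initializer_map]
        for node in graph.node
    }
    return value_info_map, initializer_map, node_to_init_map


# Stand-in for a one-node model: y = MatMul(x, w), with w stored as an initializer.
weight = SimpleNamespace(name="w")
demo_model = SimpleNamespace(
    graph=SimpleNamespace(
        input=[SimpleNamespace(name="x")],
        output=[SimpleNamespace(name="y")],
        value_info=[],
        initializer=[weight],
        node=[SimpleNamespace(name="matmul_0", input=["x", "w"], output=["y"])],
    )
)

value_info_map, initializer_map, node_to_init_map = build_maps(demo_model)
```

With a real model, the resulting maps would be passed positionally after the model, for example `PrecisionConverter(model, value_info_map, initializer_map, node_to_init_map, keep_io_types=True)`.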

convert(high_precision_nodes, low_precision_nodes)

Convert model to mixed precision.

Parameters:
  • high_precision_nodes (list[str]) – List of node names to keep in high precision.

  • low_precision_nodes (list[str]) – List of node names to convert to low precision.

Returns:

The converted mixed precision model.

Return type:

onnx.ModelProto
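
`convert` expects the caller to have already split node names into two lists. The sketch below shows one way such a split might be produced; the helper `partition_by_op_type` and the rule of keeping `Softmax` in high precision are hypothetical, chosen only to illustrate the shape of the arguments, and the nodes are stand-in objects rather than real `NodeProto` instances.

```python
from types import SimpleNamespace


def partition_by_op_type(nodes, keep_high_op_types):
    """Split node names into (high_precision, low_precision) lists by op type."""
    high, low = [], []
    for node in nodes:
        target = high if node.op_type in keep_high_op_types else low
        target.append(node.name)
    return high, low


# Stand-in nodes exposing the two attributes the helper reads.
demo_nodes = [
    SimpleNamespace(name="conv_0", op_type="Conv"),
    SimpleNamespace(name="softmax_0", op_type="Softmax"),
    SimpleNamespace(name="relu_0", op_type="Relu"),
]

high_precision_nodes, low_precision_nodes = partition_by_op_type(demo_nodes, {"Softmax"})
```

The two lists would then be passed as `converter.convert(high_precision_nodes, low_precision_nodes)`, which returns the converted `onnx.ModelProto`.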

class PrecisionTypes

Bases: tuple

PrecisionTypes(onnx_type, numpy_type, str_short, str_full)

static __new__(_cls, onnx_type, numpy_type, str_short, str_full)

Create new instance of PrecisionTypes(onnx_type, numpy_type, str_short, str_full)

numpy_type

Alias for field number 1

onnx_type

Alias for field number 0

str_full

Alias for field number 3

str_short

Alias for field number 2
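
The field aliases above are the standard behaviour of a named tuple. The sketch below rebuilds an equivalent type with `collections.namedtuple` to show how positional and named access line up. The field values are hypothetical: 10 is the ONNX `TensorProto.FLOAT16` enum value, and the string `"float16"` stands in for a NumPy dtype so that the example needs no third-party imports.

```python
from collections import namedtuple

# Same field order as the documented class: indices 0 to 3.
PrecisionTypes = namedtuple(
    "PrecisionTypes", ["onnx_type", "numpy_type", "str_short", "str_full"]
)

# Hypothetical FP16 entry; the values are illustrative, not copied from the library.
fp16 = PrecisionTypes(
    onnx_type=10, numpy_type="float16", str_short="fp16", str_full="float16"
)

# Named access and positional access refer to the same fields.
same_field = fp16.onnx_type == fp16[0] and fp16.str_short == fp16[2]

# Being a tuple, an instance can also be unpacked in field order.
onnx_type, numpy_type, str_short, str_full = fp16
```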