referencerunner

Reference runner module for ONNX model execution.

This module provides functionality for running ONNX models using ONNXRuntime as a reference implementation. It supports both random input generation and user-provided inputs through NPZ or Polygraphy JSON files. The runner is used to analyze model behavior and validate outputs during precision conversion.

When multiple batches of calibration data are provided, the runner aggregates statistics across all batches to provide more robust range information for precision conversion decisions.
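
The aggregation described above can be pictured with a small NumPy sketch. This is an illustration only, not the module's implementation: the function name `aggregate_ranges`, the dictionary layout, and the tensor name `x` are hypothetical. It assumes each batch maps tensor names to arrays and that the shape is recorded from the first batch, as documented for TensorStats.

```python
import numpy as np


def aggregate_ranges(batches):
    """Fold per-batch min/max/absmax into one summary per tensor name.

    Hypothetical helper for illustration; not part of the documented API.
    """
    stats = {}
    for batch in batches:
        for name, array in batch.items():
            lo = float(array.min())
            hi = float(array.max())
            amax = float(np.abs(array).max())
            if name not in stats:
                # First batch seen for this tensor: record its shape too.
                stats[name] = {
                    "absmax": amax,
                    "min_val": lo,
                    "max_val": hi,
                    "shape": array.shape,
                }
            else:
                entry = stats[name]
                entry["absmax"] = max(entry["absmax"], amax)
                entry["min_val"] = min(entry["min_val"], lo)
                entry["max_val"] = max(entry["max_val"], hi)
    return stats


# Two toy batches for a single tensor named "x" (hypothetical name).
batches = [
    {"x": np.array([[0.5, -2.0], [1.0, 0.0]], dtype=np.float32)},
    {"x": np.array([[3.0, -0.25], [0.1, 0.2]], dtype=np.float32)},
]
ranges = aggregate_ranges(batches)
```

A range taken from one batch would miss the value 3.0 in the second batch, which is why statistics over several batches give a safer estimate for precision decisions.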

Classes

ReferenceRunner

A class to run ONNX models with ONNXRuntime for reference inference.

TensorStats

Statistics for a tensor aggregated across multiple batches.

class ReferenceRunner

Bases: object

A class to run ONNX models with ONNXRuntime for reference inference.

__init__(model, providers=['cpu'], trt_plugins=[])

Initialize with an ONNX model (a loaded ModelProto).

Parameters:
  • model (ModelProto)

  • providers (list[str])

  • trt_plugins (list[str])

run(inputs=None)

Run FP32 inference with provided or random inputs.

When multiple batches of input data are provided, inference is run for each batch and statistics are aggregated across all batches for more robust range estimation.

Parameters:

inputs – Optional input data. Can be:

  • None: Random inputs will be generated.

  • str: Path to a JSON file, an NPZ file, or a directory containing NPZ files.

  • dict/OrderedDict: A single batch of input data.
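
As a sketch of preparing the NPZ form of `inputs`, the snippet below writes one batch to disk with NumPy and reads it back. The input name `pixel_values`, the shape, and the file name `batch_0.npz` are hypothetical, and it assumes the archive is keyed by the model's input names; check the names your model's graph actually declares.

```python
import os
import tempfile

import numpy as np

# One batch of input data. "pixel_values" is a hypothetical input name.
batch = {
    "pixel_values": np.random.rand(1, 3, 8, 8).astype(np.float32),
}

# Save the batch as an NPZ archive, one array per input name.
tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "batch_0.npz")
np.savez(npz_path, **batch)

# Read the archive back to confirm names, shapes, and dtypes survive.
with np.load(npz_path) as data:
    loaded = {name: data[name] for name in data.files}
```

Under these assumptions, either `npz_path` (a single file) or `tmp_dir` (a directory of NPZ files) would be the string passed as `inputs`.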

Returns:

Combined input and output data. For a single batch, the values are raw arrays.

For multiple batches, the values are TensorStats objects with statistics aggregated across all batches.

Return type:

OrderedDict
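
Because the values in the returned mapping differ between the single-batch and multi-batch cases, calling code may want one accessor that handles both. The sketch below is one possible approach, not part of the module: `absmax_of` is a hypothetical helper, and `SimpleNamespace` stands in for a TensorStats object so the example runs without the library installed.

```python
from types import SimpleNamespace

import numpy as np


def absmax_of(value):
    """Return the maximum absolute value for either result form.

    Hypothetical helper: aggregated results are assumed to expose an
    `absmax` attribute, while single-batch results are raw arrays.
    """
    if hasattr(value, "absmax"):
        return float(value.absmax)
    return float(np.abs(np.asarray(value)).max())


# Single-batch style value: a raw array.
raw_value = np.array([-4.0, 1.0, 2.5], dtype=np.float32)

# Multi-batch style value: a stand-in object carrying aggregated fields.
aggregated_value = SimpleNamespace(absmax=2.5, min_val=-1.0, max_val=2.5, shape=(3,))

raw_absmax = absmax_of(raw_value)
aggregated_absmax = absmax_of(aggregated_value)
```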

class TensorStats

Bases: object

Statistics for a tensor aggregated across multiple batches.

__init__(absmax, min_val, max_val, shape)

Parameters:
  • absmax (float)

  • min_val (float)

  • max_val (float)

  • shape (tuple)

Return type:

None

absmax: float

Maximum absolute value across all batches.

max_val: float

Maximum value across all batches.

min_val: float

Minimum value across all batches.

shape: tuple

Shape of the tensor (from first batch).

property size

Return total number of elements.
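
To make the fields and the `size` property concrete, here is a minimal stand-in written as a dataclass. It mirrors the documented attribute names, but the class name `TensorStatsSketch` is hypothetical, and computing `size` as the product of the shape dimensions is an assumption about what "total number of elements" means.

```python
import math
from dataclasses import dataclass


@dataclass
class TensorStatsSketch:
    """Illustrative stand-in; not the library's TensorStats class."""

    absmax: float
    min_val: float
    max_val: float
    shape: tuple

    @property
    def size(self) -> int:
        # Assumed definition: product of all dimensions (1 for a scalar).
        return math.prod(self.shape)


stats = TensorStatsSketch(absmax=3.0, min_val=-2.0, max_val=3.0, shape=(2, 3, 4))
element_count = stats.size
```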