trt_utils

This module contains TensorRT utils.

Functions

get_custom_layers

Gets custom layers in the ONNX model.

infer_types_shapes

Updates tensor shapes in the ORT graph.

infer_types_shapes_tensorrt

Update tensor types and shapes from TensorRT inference data.

load_onnx_model

Load ONNX model.

set_trt_plugin_domain

Set TensorRT plugin domain info in the graph.

get_custom_layers(onnx_path, trt_plugins, strongly_typed=False)

Gets custom layers in the ONNX model.

Parameters:
  • onnx_path (str | ModelProto) – Path or ModelProto of the input ONNX model.

  • trt_plugins (list[str] | None) – List with paths to custom TensorRT plugins.

  • strongly_typed (bool) – Boolean indicating whether to run TensorRT inference in stronglyTyped mode.

Returns:

A tuple containing the list of custom layers and a dictionary with tensor information: {'tensor_name': {'shape': tensor.shape, 'dtype': tensor.dtype}}.

Return type:

tuple[list[str], dict]
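
A minimal usage sketch follows. The import path modelopt.onnx.trt_utils, the model file name, and the plugin library path are assumptions for illustration, since the module is documented here only as trt_utils.

```python
# Hedged sketch: discover custom (plugin) layers in an ONNX model.
# The import path and file names below are placeholders.
from modelopt.onnx.trt_utils import get_custom_layers

custom_layers, tensor_info = get_custom_layers(
    onnx_path="model_with_plugin_ops.onnx",   # or a loaded onnx.ModelProto
    trt_plugins=["./libmy_trt_plugin.so"],    # compiled TensorRT plugin libraries
    strongly_typed=False,
)

print("Custom layers:", custom_layers)
# tensor_info maps tensor names to {'shape': ..., 'dtype': ...}
```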

infer_types_shapes(graph, all_tensor_info)

Updates tensor shapes in the ORT graph.

Parameters:
  • graph (Graph) – ONNX model's GraphSurgeon (GS) graph.

  • all_tensor_info (dict) – Dictionary containing tensor information.

Returns:

None. The graph is modified in place.

Return type:

None
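
A sketch of how this might be combined with get_custom_layers and onnx-graphsurgeon; the import path and file names are assumptions.

```python
import onnx
import onnx_graphsurgeon as gs

# Hypothetical import path; adjust to your installation.
from modelopt.onnx.trt_utils import get_custom_layers, infer_types_shapes

model = onnx.load("model_with_plugin_ops.onnx")
_, all_tensor_info = get_custom_layers(model, trt_plugins=["./libmy_trt_plugin.so"])

graph = gs.import_onnx(model)                # GS graph of the ONNX model
infer_types_shapes(graph, all_tensor_info)   # returns None; modifies graph in place
model = gs.export_onnx(graph)
```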

infer_types_shapes_tensorrt(model, trt_plugins=[], all_tensor_info={}, strongly_typed=False)

Update tensor types and shapes from TensorRT inference data.

Parameters:
  • model (ModelProto) – ONNX model to infer types and shapes.

  • trt_plugins (list[str]) – List of TensorRT plugin library paths in .so format (compiled shared libraries).

  • all_tensor_info (dict) – Dictionary with tensor data from the TensorRT run.

  • strongly_typed (bool) – Boolean indicating whether the TensorRT run should be stronglyTyped.

Returns:

ONNX model with inferred types and shapes.

Return type:

onnx.ModelProto
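
A hedged sketch with placeholder file names and an assumed import path:

```python
import onnx

# Hypothetical import path; adjust to your installation.
from modelopt.onnx.trt_utils import infer_types_shapes_tensorrt

model = onnx.load("model_with_plugin_ops.onnx")
model = infer_types_shapes_tensorrt(
    model,
    trt_plugins=["./libmy_trt_plugin.so"],  # compiled .so plugin libraries, if the model uses custom ops
    strongly_typed=False,
)
onnx.save(model, "model_with_inferred_shapes.onnx")
```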

load_onnx_model(onnx_path, trt_plugins=None, override_shapes=None, use_external_data_format=False, intermediate_generated_files=None)

Load ONNX model. If 'tensorrt' is installed, check whether the model has custom ops and ensure it is supported by ORT.

Parameters:
  • onnx_path (str) – Path to the input ONNX model.

  • trt_plugins (list[str] | None) – List with paths to custom TensorRT plugins.

  • override_shapes (str | None) – Override model input shapes with static shapes.

  • use_external_data_format (bool) – If True, a separate data path will be used to store the weights of the quantized model.

  • intermediate_generated_files (list[str] | None) – List of paths to intermediate ONNX files generated during quantization.

Returns:

A tuple containing: the loaded ONNX model supported by ORT; a boolean indicating whether the model has custom ops; the list of custom ops in the ONNX model; the path to the new intermediate ONNX model; and a boolean indicating whether external data format should be used for the intermediate and quantized models.

Return type:

tuple[ModelProto, bool, list[str], str, bool]
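
A sketch of unpacking the returned tuple; the import path and file names are placeholders.

```python
# Hypothetical import path; adjust to your installation.
from modelopt.onnx.trt_utils import load_onnx_model

intermediate_files = []
model, has_custom_ops, custom_ops, new_onnx_path, use_external_data = load_onnx_model(
    "model_with_plugin_ops.onnx",
    trt_plugins=["./libmy_trt_plugin.so"],
    use_external_data_format=False,
    intermediate_generated_files=intermediate_files,
)
print(has_custom_ops, custom_ops, new_onnx_path, use_external_data)
```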

set_trt_plugin_domain(model, custom_ops)

Set TensorRT plugin domain info in the graph.

Parameters:
  • model (ModelProto) – ONNX model in which to set the custom op domain.

  • custom_ops (list[str]) – List of custom ops.

Returns:

ONNX model with the TensorRT plugin domain set on its custom ops.

Return type:

onnx.ModelProto
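
A sketch combining this with get_custom_layers; the import path and file names are placeholders.

```python
import onnx

# Hypothetical import path; adjust to your installation.
from modelopt.onnx.trt_utils import get_custom_layers, set_trt_plugin_domain

model = onnx.load("model_with_plugin_ops.onnx")
custom_ops, _ = get_custom_layers(model, trt_plugins=["./libmy_trt_plugin.so"])
model = set_trt_plugin_domain(model, custom_ops)
onnx.save(model, "model_with_plugin_domain.onnx")
```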