export_onnx
Utility to export a quantized torch model to quantized ONNX.
Functions
- export_fp8: Export quantized model to FP8 ONNX.
- export_int8: Export quantized model to INT8 ONNX.
- export_fp8(g, inputs, amax, trt_high_precision_dtype)
Export quantized model to FP8 ONNX.
- Parameters:
  - g (GraphContext) – ONNX graph context from the torch.onnx exporter.
  - inputs (Value) – Input tensor value to be quantized.
  - amax (float) – Maximum absolute calibration value used to derive the quantization scale.
  - trt_high_precision_dtype (str) – High-precision dtype to preserve around the quantized op.
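As background on how `amax` relates to an FP8 export: in E4M3 FP8 quantization the scale typically maps the calibrated range `[-amax, amax]` onto the E4M3 representable maximum of 448. A minimal sketch of that scale computation (the helper name `fp8_scale` is hypothetical, not part of this module's API):

```python
def fp8_scale(amax: float, e4m3_max: float = 448.0) -> float:
    """Compute the FP8 quantization scale from a calibrated amax.

    Dividing by the E4M3 maximum (448) maps |x| <= amax onto the
    representable FP8 range; values are then quantized as x / scale.
    """
    return amax / e4m3_max

# An amax equal to the E4M3 maximum yields a unit scale.
print(fp8_scale(448.0))  # → 1.0
```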
- export_int8(g, inputs, amax, num_bits, unsigned, narrow_range, trt_high_precision_dtype)
Export quantized model to INT8 ONNX.
- Parameters:
  - g (GraphContext) – ONNX graph context from the torch.onnx exporter.
  - inputs (Value) – Input tensor value to be quantized.
  - amax (Tensor) – Maximum absolute calibration value(s) used to derive the quantization scale.
  - num_bits (int) – Number of quantization bits.
  - unsigned (bool) – Whether to quantize to an unsigned integer range.
  - narrow_range (bool) – Whether to use the symmetric integer range (e.g. [-127, 127] instead of [-128, 127] for 8 bits).
  - trt_high_precision_dtype (str) – High-precision dtype to preserve around the quantized op.
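To illustrate how `num_bits`, `unsigned`, and `narrow_range` interact with `amax`, here is a minimal sketch of the standard integer quantization bounds and scale they imply (the helper names are hypothetical, not part of this module's API):

```python
def int_bounds(num_bits: int = 8, unsigned: bool = False,
               narrow_range: bool = False) -> tuple[int, int]:
    """Return (qmin, qmax) for the chosen integer quantization range."""
    if unsigned:
        return 0, 2 ** num_bits - 1
    qmax = 2 ** (num_bits - 1) - 1
    # narrow_range drops the most negative value, making the range symmetric.
    qmin = -qmax if narrow_range else -qmax - 1
    return qmin, qmax

def int_scale(amax: float, num_bits: int = 8, unsigned: bool = False,
              narrow_range: bool = False) -> float:
    """Scale mapping [-amax, amax] onto the integer range; q = round(x / scale)."""
    _, qmax = int_bounds(num_bits, unsigned, narrow_range)
    return amax / qmax

print(int_bounds(8, False, False))  # → (-128, 127)
print(int_bounds(8, False, True))   # → (-127, 127)
print(int_scale(127.0))             # → 1.0
```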