quantization

Modules

modelopt.torch.quantization.backends

Quantization backends.

modelopt.torch.quantization.calib

Calibrator classes.

modelopt.torch.quantization.compress(model)

Compress model weights of quantized model.

modelopt.torch.quantization.config

Quantization formats supported by Model Optimizer, with example quantization configs.

modelopt.torch.quantization.conversion

Quantization conversion/restore utilities.

modelopt.torch.quantization.export_onnx

Utility to export a quantized torch model to quantized ONNX.

modelopt.torch.quantization.extensions

Module to load C++ / CUDA extensions.

modelopt.torch.quantization.mode

Mode descriptor for the quantization mode.

modelopt.torch.quantization.model_calib

Calibration utilities.

modelopt.torch.quantization.model_quant

User-facing quantization API.

modelopt.torch.quantization.nn

Modules with quantization support.

modelopt.torch.quantization.optim

Deprecated.

modelopt.torch.quantization.plugins

Handles quantization plugins to correctly quantize third-party modules.

modelopt.torch.quantization.qtensor

Tensor class for real quantization.

modelopt.torch.quantization.quant_modules

Deprecated.

modelopt.torch.quantization.tensor_quant

Basic tensor quantization functions.

modelopt.torch.quantization.triton

Triton quantization kernels.

modelopt.torch.quantization.utils

Quantization utilities.

Quantization package.
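
For readers new to the terms used in the module summaries above, the following is a minimal, library-independent sketch of what "calibration" and "basic tensor quantization" usually mean: collect a maximum absolute value (amax) from sample data, then quantize and dequantize values onto a symmetric integer grid. It uses plain Python only. The names `calibrate_amax` and `fake_quantize` are hypothetical and chosen for illustration; they are not part of `modelopt.torch.quantization`, and the package's own calibrators and kernels may work differently.

```python
def calibrate_amax(batches):
    """Max calibration (illustrative): largest absolute value seen in any batch."""
    amax = 0.0
    for batch in batches:
        for x in batch:
            amax = max(amax, abs(x))
    return amax


def fake_quantize(values, amax, num_bits=8):
    """Symmetric quantize-dequantize onto a signed integer grid (illustrative)."""
    if amax <= 0:
        raise ValueError("amax must be positive")
    qmax = 2 ** (num_bits - 1) - 1       # 127 when num_bits is 8
    scale = amax / qmax                  # real-valued width of one integer step
    out = []
    for x in values:
        q = round(x / scale)             # nearest integer level
        q = max(-qmax, min(qmax, q))     # clip values outside [-amax, amax]
        out.append(q * scale)            # map back to the real-valued range
    return out


# Hypothetical calibration data: two small batches of activations.
calibration_batches = [[0.1, -0.5, 0.25], [1.0, -2.0, 0.75]]
amax = calibrate_amax(calibration_batches)

# 0.3 lands on the nearest grid point, -2.0 sits at the range edge, 5.0 is clipped.
dequantized = fake_quantize([0.3, -2.0, 5.0], amax)
```

Values inside the calibrated range are reproduced to within half a quantization step, while values beyond it saturate at the range edge. This is why the choice of calibration data affects accuracy after quantization.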