conversion

Quantization conversion/restore utilities.

Functions

replace_quant_module

Recursively replace each module in the model with its registered quantized module.

set_quantizer_by_cfg

Update the quantizer attributes based on the specified quant_cfg.

set_quantizer_attribute

Fine-grained adjustment of quantizer attributes by wildcard or filter function.

register

Register a quantized class for the given un-quantized original class.

unregister

Unregister the quantized class for the given un-quantized original class.

register(original_cls, quantized_cls)

Register a quantized class for the given un-quantized original class.

Parameters:
  • original_cls (Module) – The original un-quantized class.

  • quantized_cls (Module) – The quantized class. This class should have a _setup method that initializes the quantizers used in forward. The forward method of the quantized class should call each quantizer at the point where its tensor is consumed.

Here is an example of defining a quantized class and registering it:

import torch.nn as nn
import torch.nn.functional as F

import modelopt.torch.quantization as mtq
from modelopt.torch.quantization.tensor_quant import TensorQuantizer, QuantDescriptor


class QuantLayerNorm(nn.LayerNorm):
    def __init__(self, normalized_shape):
        super().__init__(normalized_shape)
        self._setup()

    def _setup(self):
        # Method to setup the quantizers
        self.input_quantizer = TensorQuantizer(QuantDescriptor())
        self.weight_quantizer = TensorQuantizer(QuantDescriptor())

    def forward(self, input):
        input = self.input_quantizer(input)
        weight = self.weight_quantizer(self.weight)
        return F.layer_norm(input, self.normalized_shape, weight, self.bias, self.eps)


# Register the custom quantized module
mtq.register(original_cls=nn.LayerNorm, quantized_cls=QuantLayerNorm)

replace_quant_module(model)

Recursively replace each module in the model with its registered quantized module.

Parameters:

model (Module) – The model whose modules are replaced.
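
The recursion can be pictured with a small stand-in that uses no PyTorch at all. Everything in it (the Node tree, Linear, QuantLinear, the _REGISTRY dict and replace_with_quantized) is hypothetical and is not the library's implementation. It only illustrates walking a module tree and swapping each node whose class has a registered quantized counterpart.

```python
class Node:
    """Hypothetical stand-in for a module that holds named children."""

    def __init__(self, **children):
        self.children = dict(children)


class Linear(Node):
    """Hypothetical un-quantized leaf class."""


class QuantLinear(Linear):
    """Hypothetical quantized counterpart of Linear."""

    def _setup(self):
        # Mirrors the _setup convention described for quantized classes.
        self.quantizers = ["input_quantizer", "weight_quantizer"]


# Assumed registry shape for this sketch: original class -> quantized class.
_REGISTRY = {Linear: QuantLinear}


def replace_with_quantized(node):
    """Swap registered children for their quantized class, depth first."""
    for child in node.children.values():
        replace_with_quantized(child)
        quantized_cls = _REGISTRY.get(type(child))
        if quantized_cls is not None:
            # In this sketch the object is reused and only its class changes.
            child.__class__ = quantized_cls
            child._setup()


model = Node(block=Node(fc1=Linear(), fc2=Linear()), head=Linear())
replace_with_quantized(model)
```

After the call, every Linear in the tree is a QuantLinear with its quantizers set up, while the plain Node containers are left unchanged because nothing is registered for them.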

set_quantizer_attribute(quant_model, wildcard_or_filter_func, attribute)

Fine-grained adjustment of quantizer attributes by wildcard or filter function.

Parameters:
  • quant_model (Module) – A PyTorch model.

  • wildcard_or_filter_func (str | Callable) – A wildcard string or a filter function. The wildcard string is matched against the quantizer module names. The quantizer modules are instances of TensorQuantizer. The filter function takes a quantizer module name as input and returns True if the quantizer should be adjusted and False otherwise.

  • attribute – a dict of quantizer attributes or a list of quantizer attribute dicts. An example attribute dict is: {"num_bits": 8, "axis": 0, "enable": True}. If attribute is a list of dicts, the matched TensorQuantizer modules will be replaced with SequentialQuantizer modules having one quantizer for each attribute dict from the list. See set_from_attribute_dict for more details on the supported attributes and their types.
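
To make the selection rule concrete, here is a minimal sketch of choosing quantizers by name. It assumes the wildcard follows shell-style matching as provided by Python's fnmatch; the quantizer names and the select helper are hypothetical and are not part of the library.

```python
from fnmatch import fnmatch

# Hypothetical quantizer module names, as they might appear in a model.
quantizer_names = [
    "layers.0.attn.input_quantizer",
    "layers.0.attn.weight_quantizer",
    "layers.1.mlp.input_quantizer",
    "layers.1.mlp.weight_quantizer",
]


def select(names, wildcard_or_filter_func):
    """Return the names picked by a wildcard string or a filter function."""
    if isinstance(wildcard_or_filter_func, str):
        # Assumption: wildcards behave like shell-style patterns.
        return [n for n in names if fnmatch(n, wildcard_or_filter_func)]
    # Otherwise treat the argument as a callable: name -> bool.
    return [n for n in names if wildcard_or_filter_func(n)]


# A wildcard that picks every weight quantizer.
weight_quantizers = select(quantizer_names, "*weight_quantizer")

# A filter function that picks every quantizer under an MLP block.
mlp_quantizers = select(quantizer_names, lambda name: ".mlp." in name)
```

A wildcard is usually enough for suffix or prefix matches. A filter function is the better fit when the condition depends on something a single pattern cannot express, such as excluding a set of layers.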

set_quantizer_by_cfg(quant_model, quant_cfg)

Update the quantizer attributes based on the specified quant_cfg.

quant_cfg is a dictionary mapping wildcards or filter functions to their quantizer attributes. The wildcards or filter functions are matched against the quantizer module names, and the specified attributes are set on every matched quantizer module.

See set_quantizer_attribute for more details.

Parameters:

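quant_model (Module) – A PyTorch model.

quant_cfg – A dictionary mapping wildcards or filter functions to quantizer attributes, as described above.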
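
The shape of a quant_cfg and the effect of applying it can be sketched with plain dictionaries. This is a stand-in, not the library's code: the quantizer names, the is_lm_head filter and the apply_cfg helper are hypothetical, and the sketch assumes that entries are applied in dictionary order so that a later entry updates attributes set by an earlier one.

```python
from fnmatch import fnmatch

# Hypothetical quantizer state, keyed by quantizer module name.
quantizers = {
    "fc1.input_quantizer": {},
    "fc1.weight_quantizer": {},
    "lm_head.weight_quantizer": {},
}


def is_lm_head(name):
    """Hypothetical filter function: match quantizers of the output head."""
    return name.startswith("lm_head")


# Example quant_cfg: wildcard or filter function -> quantizer attributes.
quant_cfg = {
    "*weight_quantizer": {"num_bits": 8, "axis": 0},
    "*input_quantizer": {"num_bits": 8, "axis": None},
    is_lm_head: {"enable": False},
}


def apply_cfg(quantizers, quant_cfg):
    """Apply each entry in order; later entries update earlier ones."""
    for pattern, attributes in quant_cfg.items():
        for name, state in quantizers.items():
            if isinstance(pattern, str):
                matched = fnmatch(name, pattern)
            else:
                matched = pattern(name)
            if matched:
                state.update(attributes)


apply_cfg(quantizers, quant_cfg)
```

In this sketch the lm_head weight quantizer first receives the weight settings and is then disabled by the filter-function entry, which shows why the order of entries can matter under the stated assumption.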

unregister(original_cls)

Unregister the quantized class for the given un-quantized original class.

Parameters:

original_cls (Module) – The original un-quantized class.