model_calib

Calibration utilities.

Functions

  • calibrate – Adjusts weights and scaling factors based on selected algorithms.

  • postprocess_amax – Experimental API to postprocess the amax values after calibration.

calibrate(model, algorithm='max', forward_loop=None)

Adjusts weights and scaling factors based on selected algorithms.

Parameters:
  • model (Module) – A PyTorch model with quantizer modules.

  • algorithm (str | MaxCalibConfig | SmoothQuantCalibConfig | AWQLiteCalibConfig | AWQClipCalibConfig | AWQFullCalibConfig | RealQuantizeConfig | None) – A string or dictionary specifying the calibration algorithm to use. Supported algorithms are "max", "smoothquant", "awq_lite", "awq_full", and "awq_clip". If a dictionary is passed, its "method" key should name the calibration algorithm; all other key-value pairs are passed to that algorithm as kwargs. An example dictionary argument: {"method": "awq_clip", "max_co_batch_size": 4096}. If None, no calibration is performed. For real quantization, "method" should be "real_quantize", and the calibration algorithm to use should be specified in "additional_algorithm".

  • forward_loop (Callable[[Module], None] | None) – A callable that takes the model as its only argument and forwards calibration data through it. Not required for weight-only quantization with the "max" algorithm.

Return type:

Module

Returns: The calibrated PyTorch model.
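The following is a minimal sketch of how the two less obvious arguments might be prepared, not the library's own code. The helpers split_algorithm and make_forward_loop and the RecordingModel stand-in are hypothetical names introduced for illustration only. The sketch avoids importing PyTorch so that it runs on its own; the actual calibrate call is left as a comment because this page does not state the package path of model_calib.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple, Union


def split_algorithm(
    algorithm: Union[str, Dict[str, Any], None],
) -> Tuple[Optional[str], Dict[str, Any]]:
    """Illustrative only: separate the method name from the remaining kwargs.

    Mirrors the documented convention that "method" names the algorithm and
    every other key-value pair is forwarded as a keyword argument.
    """
    if algorithm is None:
        return None, {}  # None means no calibration is performed
    if isinstance(algorithm, str):
        return algorithm, {}
    kwargs = dict(algorithm)  # copy so the caller's dictionary is not mutated
    method = kwargs.pop("method")
    return method, kwargs


def make_forward_loop(batches: Iterable[Any]) -> Callable[[Any], None]:
    """Build a forward_loop: a callable that takes the model and returns None."""
    cached = list(batches)

    def forward_loop(model: Any) -> None:
        for batch in cached:
            model(batch)  # forward calibration data through the model

    return forward_loop


class RecordingModel:
    """Hypothetical stand-in for a PyTorch module; it only records its inputs."""

    def __init__(self) -> None:
        self.seen = []

    def __call__(self, batch: Any) -> None:
        self.seen.append(batch)


# Dictionary form of `algorithm`, taken from the parameter description above.
algorithm = {"method": "awq_clip", "max_co_batch_size": 4096}
method, extra_kwargs = split_algorithm(algorithm)

# Exercise the forward loop against the stand-in model.
stand_in = RecordingModel()
forward_loop = make_forward_loop([[1, 2], [3, 4]])
forward_loop(stand_in)

# With the real library, the call would look like this (import path assumed):
# model = calibrate(model, algorithm=algorithm, forward_loop=forward_loop)
```

In real use, the batches would typically be tensors drawn from a small calibration dataset, and the model would already contain quantizer modules, as the model parameter requires.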

postprocess_amax(model, key, post_process_fn)

Experimental API to postprocess the amax values after calibration.

Parameters:
  • model (Module)

  • key (str)

Return type:

Module