model_calib
Calibration utilities.
Functions
calibrate | Adjusts weights and scaling factors based on selected algorithms.
postprocess_amax | Experimental API to postprocess the amax values after calibration.
- calibrate(model, algorithm='max', forward_loop=None)
Adjusts weights and scaling factors based on selected algorithms.
- Parameters:
model (Module) – A PyTorch model with quantizer modules.
algorithm (str | MaxCalibConfig | SmoothQuantCalibConfig | AWQLiteCalibConfig | AWQClipCalibConfig | AWQFullCalibConfig | RealQuantizeConfig | None) – A string or dictionary specifying the calibration algorithm to use. Supported algorithms are "max", "smoothquant", "awq_lite", "awq_full", and "awq_clip". If a dictionary is passed, the key "method" should specify the calibration algorithm to use. Other key-value pairs in this dictionary will be passed as kwargs to the algorithm. An example dictionary argument: {"method": "awq_clip", "max_co_batch_size": 4096}. If None, no calibration is performed. For real quantization, the key "method" should be "real_quantize", and the calibration algorithm used should be specified in "additional_algorithm".
forward_loop (Callable[[Module], None] | None) – A callable which takes the model as argument and forwards calibration data through the model. This is not required for weight-only quantization with the "max" algorithm.
- Return type:
Module
Returns: The calibrated PyTorch model.
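The two non-model arguments of calibrate can be prepared as in the minimal sketch below. It shows the dictionary form of algorithm and a forward_loop callable that pushes calibration batches through the model. The helper make_forward_loop, the dummy_model stand-in, and the sample batches are hypothetical names introduced only for illustration; they are not part of this module. The sketch deliberately avoids importing the library so it runs on its own, and the real call appears only as a comment.

```python
from typing import Any, Callable, Iterable


def make_forward_loop(batches: Iterable[Any]) -> Callable[[Callable[[Any], Any]], None]:
    """Hypothetical helper: build a forward_loop that feeds each batch to the model."""
    batches = list(batches)  # materialize so the loop can be replayed

    def forward_loop(model: Callable[[Any], Any]) -> None:
        for batch in batches:
            model(batch)  # outputs are discarded; only the forward pass matters

    return forward_loop


# Dictionary form of the `algorithm` argument, using the example from the
# parameter description: "method" selects the algorithm, and the remaining
# key-value pairs are passed to it as kwargs.
algorithm = {"method": "awq_clip", "max_co_batch_size": 4096}

# Stand-in for a real model (an assumption, so the sketch has no dependencies).
seen = []


def dummy_model(batch):
    seen.append(batch)
    return batch


forward_loop = make_forward_loop([[1, 2], [3, 4]])
forward_loop(dummy_model)

# With a real model containing quantizer modules, the call would look like:
#   model = calibrate(model, algorithm=algorithm, forward_loop=forward_loop)
```

Because forward_loop returns nothing and only needs to drive the forward pass, it matches the documented Callable[[Module], None] type; any data loading or device placement would live inside it.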
- postprocess_amax(model, key, post_process_fn)
Experimental API to postprocess the amax values after calibration.
- Parameters:
model (Module) –
key (str) –
- Return type:
Module