histogram
Histogram-based calibrators.
Classes
HistogramCalibrator | Unified histogram calibrator.
Functions
calibrate_weights | Calibrate weights of all child quantized modules.
- class HistogramCalibrator
Bases: _Calibrator
Unified histogram calibrator.
- The histogram is collected only once. compute_amax() then performs entropy, percentile, or MSE calibration, depending on its arguments.
- Parameters:
num_bits – An integer. Number of bits of quantization.
axis – A tuple. See QuantizerAttributeConfig.
unsigned – A boolean. If True, use unsigned quantization.
num_bins – An integer. Number of histogram bins. Default 2048.
grow_method – A string. DEPRECATED. Default None.
skip_zeros – A boolean. If True, skips zeros when collecting data for histogram. Default False.
torch_hist – A boolean. If True, collect the histogram with torch.histc instead of np.histogram. If the input tensor is on the GPU, histc also runs on the GPU. Default True.
- __init__(num_bits=8, axis=None, unsigned=False, num_bins=2048, grow_method=None, skip_zeros=False, torch_hist=True)
Initialize.
- collect(x)
Collect histogram.
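The collection step can be pictured with a small plain-Python sketch. It is not the library's implementation: the function name `collect_sketch`, the `state` tuple, the use of absolute values, and the rule that later batches extend the histogram with extra bins of the same width are all assumptions made for illustration. Only `num_bins` and `skip_zeros` correspond to documented parameters.

```python
import math


def collect_sketch(values, state=None, num_bins=2048, skip_zeros=False):
    """Accumulate |values| into a histogram (hypothetical helper, not the library API).

    `state` is a (counts, bin_width) pair returned by an earlier call, or None.
    """
    # Assumption: the histogram is built over magnitudes, covering [0, max|x|].
    mags = [abs(v) for v in values]
    if skip_zeros:
        mags = [m for m in mags if m != 0.0]
    if not mags:
        return state

    top = max(mags)
    if state is None:
        # The first batch fixes the bin width; fall back to a unit range if all zeros.
        width = (top if top > 0 else 1.0) / num_bins
        counts = [0] * num_bins
    else:
        counts, width = state
        counts = list(counts)  # copy so the caller's state is not mutated

    # Assumption: larger values in later batches add bins instead of rescaling.
    needed = max(len(counts), math.ceil(top / width))
    counts.extend([0] * (needed - len(counts)))

    for m in mags:
        idx = min(int(m / width), len(counts) - 1)  # the maximum lands in the last bin
        counts[idx] += 1
    return counts, width
```

Calling the helper repeatedly with the returned `state` mimics feeding several batches to collect(x) before a single amax computation.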
- compute_amax(method, *, stride=1, start_bin=128, percentile=99.99)
Compute the amax from the collected histogram.
- Parameters:
method (str) – A string. One of [‘entropy’, ‘mse’, ‘percentile’]
stride (int) – Keyword-only. An integer. Default 1.
start_bin (int) – Keyword-only. An integer. Default 128.
percentile (float) – Keyword-only. A float in the range [0, 100]. Default 99.99.
- Returns:
amax, a tensor.
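As a rough illustration of the 'percentile' method, the sketch below picks the first bin at which the cumulative count reaches the requested percentile. The function name `percentile_amax_sketch`, its `counts` and `bin_width` inputs, and the choice to return the upper edge of that bin are assumptions; the library may select the edge differently.

```python
def percentile_amax_sketch(counts, bin_width, percentile=99.99):
    """Return an amax covering `percentile` percent of the histogram mass.

    Hypothetical helper: `counts` holds per-bin counts of |x| and bin i
    spans [i * bin_width, (i + 1) * bin_width).
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be in the range [0, 100]")
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")

    target = total * percentile / 100.0
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            # Assumption: report the upper edge of the first bin reaching the target.
            return (i + 1) * bin_width
    return len(counts) * bin_width
```

With a heavy-tailed histogram, lowering `percentile` clips the rare large magnitudes and yields a smaller amax, which is the usual reason to prefer it over the plain maximum.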
- reset()
Reset the collected histogram.
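For the 'mse' method of compute_amax(), the page does not describe how stride and start_bin are used. The sketch below assumes they define the candidate clipping points: candidates begin at bin start_bin and advance by stride bins, and the candidate with the lowest histogram-weighted squared quantization error wins. The function name `mse_amax_sketch` and this search rule are illustrative assumptions, not the library's code.

```python
def mse_amax_sketch(counts, bin_width, num_bits=8, unsigned=False,
                    stride=1, start_bin=128):
    """Pick the amax that minimizes weighted squared quantization error.

    Hypothetical helper: `counts` holds per-bin counts of |x|, and each bin
    is represented by its center value.
    """
    # Largest integer level: 127 for signed 8-bit, 255 for unsigned 8-bit.
    levels = 2 ** (num_bits - 1 + int(unsigned)) - 1
    centers = [(i + 0.5) * bin_width for i in range(len(counts))]

    best_amax, best_err = None, float("inf")
    # Assumption: candidate i clips at the edge i * bin_width.
    for i in range(max(start_bin, 1), len(counts) + 1, stride):
        amax = i * bin_width
        scale = amax / levels
        err = 0.0
        for center, count in zip(centers, counts):
            if count == 0:
                continue
            # Round to the nearest level and clip at the top of the grid.
            quantized = min(round(center / scale), levels) * scale
            err += count * (center - quantized) ** 2
        if err < best_err:
            best_amax, best_err = amax, err
    return best_amax
```

A larger stride trades search resolution for speed, and a larger start_bin rules out very aggressive clipping; both readings are assumptions consistent with the defaults shown in the signature.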
- calibrate_weights(model, method='percentile', perchannel=True, percentile=99.99, num_bins=2048)
Calibrate weights of all child quantized modules.
Ideally, we would split the calibration functionality into a histogram collector and a calibrator that takes a histogram and computes amax. But since we haven’t decoupled the collector and the calibrator, it is easier to create a separate function to calibrate weights.
Note
This function uses the calibration method given by the method argument, NOT the one specified by the calibrator embedded in weight_quantizer. We haven’t moved calibration to the GPU, so everything is transferred to the CPU.
- Parameters:
model – A torch.nn.Module.
method – A string naming the calibration method. Supports “mse” and “percentile”. Default “percentile”.
perchannel – A bool. Set channel/neuron axis if True. Default True.
percentile – A float. Default 99.99.
num_bins – An integer. Number of histogram bins. Default 2048.
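The effect of perchannel can be sketched in plain Python on a weight matrix stored as nested lists. This is an illustration under stated assumptions, not the library's code: the names `calibrate_rows_sketch` and `_row_percentile_amax` are hypothetical, each row is treated as one output channel, and only the percentile method is shown.

```python
def _row_percentile_amax(values, percentile, num_bins):
    """Histogram |values| into num_bins bins and return a percentile amax."""
    mags = [abs(v) for v in values]
    top = max(mags)
    if top == 0:
        return 0.0
    width = top / num_bins
    counts = [0] * num_bins
    for m in mags:
        counts[min(int(m / width), num_bins - 1)] += 1

    target = len(mags) * percentile / 100.0
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            return (i + 1) * width  # upper edge of the first bin reaching the target
    return top


def calibrate_rows_sketch(weight, perchannel=True, percentile=99.99, num_bins=2048):
    """Return one amax per row if perchannel is True, else a single amax."""
    # Assumption: axis 0 is the channel/neuron axis.
    groups = weight if perchannel else [[w for row in weight for w in row]]
    return [_row_percentile_amax(g, percentile, num_bins) for g in groups]
```

Per-channel calibration keeps a row of small weights from inheriting the range of a row with large weights, at the cost of storing one amax per channel.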