histogram

Histogram-based calibrators.

Classes

HistogramCalibrator

Unified histogram calibrator.

Functions

calibrate_weights

Calibrate weights of all child quantized modules.

class HistogramCalibrator

Bases: _Calibrator

Unified histogram calibrator.

The histogram is collected only once. compute_amax() then performs entropy, percentile, or MSE calibration, depending on its arguments.

Parameters:
  • num_bits – An integer. Number of bits of quantization.

  • axis – A tuple. See QuantizerAttributeConfig.

  • unsigned – A boolean. If True, uses unsigned quantization.

  • num_bins – An integer. Number of histogram bins. Default 2048.

  • grow_method – A string. DEPRECATED. Default None.

  • skip_zeros – A boolean. If True, skips zeros when collecting data for the histogram. Default False.

  • torch_hist – A boolean. If True, collects the histogram with torch.histc instead of np.histogram. If the input tensor is on the GPU, histc also runs on the GPU. Default True.

__init__(num_bits=8, axis=None, unsigned=False, num_bins=2048, grow_method=None, skip_zeros=False, torch_hist=True)

Initialize.

collect(x)

Collect histogram.
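As a rough illustration of what collecting such a histogram involves, here is a minimal sketch in plain Python. It does not use torch.histc or np.histogram. The helper name collect_histogram, the use of absolute values, and the fixed range [0, max|x|] are assumptions made for illustration, not the calibrator's actual implementation.

```python
def collect_histogram(values, num_bins=2048, skip_zeros=False):
    """Hypothetical helper: histogram of absolute values over [0, max|x|].

    Returns (counts, edges), where len(edges) == num_bins + 1.
    """
    abs_values = [abs(v) for v in values]
    if skip_zeros:
        # Mirror the documented skip_zeros option: drop exact zeros.
        abs_values = [v for v in abs_values if v != 0]
    if not abs_values:
        return [0] * num_bins, [0.0] * (num_bins + 1)

    # Fall back to a unit range if every value is zero (assumption).
    x_max = max(abs_values) or 1.0
    width = x_max / num_bins
    edges = [i * width for i in range(num_bins + 1)]

    counts = [0] * num_bins
    for v in abs_values:
        # The maximum value lands in the last bin instead of overflowing.
        idx = min(int(v / width), num_bins - 1)
        counts[idx] += 1
    return counts, edges
```

For example, collect_histogram([0.0, -1.0, 0.5, 2.0], num_bins=4) places one value in each of the four bins, and passing skip_zeros=True leaves the first bin empty.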

compute_amax(method, *, stride=1, start_bin=128, percentile=99.99)

Compute the amax from the collected histogram.

Parameters:
  • method (str) – A string. One of [‘entropy’, ‘mse’, ‘percentile’].

Keyword Arguments:
  • stride (int) – An integer. Default 1.

  • start_bin (int) – An integer. Default 128.

  • percentile (float) – A float in the range [0, 100]. Default 99.99.

Returns:

amax, a tensor.
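To make the percentile and MSE options more concrete, the following plain-Python sketch derives an amax from a histogram given as (counts, edges). It is only an approximation under stated assumptions: the function names are hypothetical, the percentile variant returns the upper edge of the bin where the cumulative count first reaches the requested percentile, and the MSE variant assumes that start_bin and stride select which bin centers are tried as candidate amax values. The real compute_amax() may differ in all of these details.

```python
def percentile_amax(counts, edges, percentile=99.99):
    """Hypothetical helper: amax covering `percentile` percent of the counts."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be in [0, 100]")
    total = sum(counts)
    if total == 0:
        return None
    threshold = total * percentile / 100.0
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= threshold:
            return edges[i + 1]  # upper edge of the bin (assumption)
    return edges[-1]


def fake_quantize(x, amax, num_bits=8):
    """Symmetric signed quantize-dequantize of a non-negative value x."""
    max_level = 2 ** (num_bits - 1) - 1
    scale = amax / max_level
    level = min(round(x / scale), max_level)  # clip to the largest level
    return level * scale


def mse_amax(counts, edges, num_bits=8, start_bin=128, stride=1):
    """Hypothetical helper: candidate amax with the lowest weighted MSE."""
    centers = [(edges[i] + edges[i + 1]) / 2 for i in range(len(counts))]
    best_amax, best_err = None, float("inf")
    # Assumed meaning: try every `stride`-th bin center from `start_bin` on.
    for i in range(start_bin, len(centers), stride):
        amax = centers[i]
        if amax <= 0:
            continue
        err = sum(
            c * (fake_quantize(x, amax, num_bits) - x) ** 2
            for x, c in zip(centers, counts)
        )
        if err < best_err:
            best_amax, best_err = amax, err
    return best_amax
```

A lower percentile clips more of the tail and yields a smaller amax. The MSE search instead trades clipping error against rounding error across the candidate values.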

reset()

Reset the collected histogram.

calibrate_weights(model, method='percentile', perchannel=True, percentile=99.99, num_bins=2048)

Calibrate weights of all child quantized modules.

Ideally, calibration would be split into a histogram collector and a calibrator that takes the histogram and computes amax. Because the collector and calibrator have not been decoupled, it is easier to provide a separate function for calibrating weights.

Note

This function uses the method given by the method argument, NOT the one specified by the calibrator embedded in weight_quantizer. Calibration has not been moved to the GPU, so everything is transferred to the CPU.

Parameters:
  • model – A torch.nn.Module.

  • method – A string of calibration method. Supports “mse” and “percentile”. Default “percentile”

  • perchannel – A bool. Set channel/neuron axis if True. Default True.

  • percentile – A float. Default 99.99

  • num_bins – An integer. Number of histogram bins. Default 2048.
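The effect of perchannel can be sketched without any quantization library. The example below treats a weight as a list of rows, one per output channel, and computes either one amax per row or a single amax for the whole tensor. It is a simplification: it takes a nearest-rank percentile of the absolute values directly instead of going through a histogram. The names abs_percentile and weight_amax, and the choice of rows as the channel axis, are assumptions for illustration.

```python
import math


def abs_percentile(values, percentile):
    """Nearest-rank percentile of the absolute values (simplification)."""
    ordered = sorted(abs(v) for v in values)
    rank = math.ceil(percentile / 100.0 * len(ordered))
    return ordered[max(rank - 1, 0)]


def weight_amax(weight_rows, percentile=99.99, perchannel=True):
    """Hypothetical helper: per-channel or per-tensor amax of a 2D weight."""
    if perchannel:
        # One amax per output channel (row).
        return [abs_percentile(row, percentile) for row in weight_rows]
    # A single amax shared by the whole tensor.
    flat = [v for row in weight_rows for v in row]
    return abs_percentile(flat, percentile)


weight = [[0.1, -0.4], [2.0, -1.0]]
per_channel = weight_amax(weight, percentile=100, perchannel=True)
per_tensor = weight_amax(weight, percentile=100, perchannel=False)
```

Here per_channel holds one value per row, so the first row's small weights are not forced onto the range set by the second row's larger ones, whereas per_tensor is a single value dominated by the largest weight.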