quant_linear

Quantized Linear.

Classes

Linear            alias of QuantLinear
QuantLinear       Quantized version of nn.Linear.
RealQuantLinear   Quantized version of nn.Linear with real quantization.
SVDQuantLinear    Base class for quantized linear modules with SVDQuant.

Linear

alias of QuantLinear

class QuantLinear

Bases: _LegacyQuantLinearConvBaseMixin, Linear

Quantized version of nn.Linear.

default_quant_desc_weight = QuantizerAttributeConfig(enable=True, num_bits=8, axis=0, fake_quant=True, unsigned=False, narrow_range=False, learn_amax=False, type='static', block_sizes=None, bias=None, trt_high_precision_dtype='Float', calibrator='max', rotate=False, pass_through_bwd=False)
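
The default weight descriptor above (num_bits=8, axis=0, fake_quant=True, calibrator='max', unsigned=False, narrow_range=False) can be illustrated with a minimal, dependency-free sketch. It assumes the common symmetric scheme in which each output channel (row) is scaled by its own maximum absolute value, rounded to an integer grid, clamped, and immediately dequantized. The function name fake_quant_per_channel is hypothetical and is not part of the library; the library's actual kernels and rounding mode may differ.

```python
def fake_quant_per_channel(weight, num_bits=8, narrow_range=False):
    """Quantize-dequantize each row (axis 0) of a 2-D weight given as a list of lists.

    Illustrative only: hypothetical helper, not the library's implementation.
    """
    max_bound = 2 ** (num_bits - 1) - 1                      # 127 for signed 8-bit
    min_bound = -max_bound if narrow_range else -max_bound - 1
    out = []
    for row in weight:
        amax = max(abs(v) for v in row)                      # "max" calibration per channel
        if amax == 0.0:
            out.append([0.0 for _ in row])                   # all-zero channel stays zero
            continue
        scale = max_bound / amax                             # maps amax onto the grid edge
        # Python's round() rounds halves to even; other implementations may differ.
        out.append([min(max(round(v * scale), min_bound), max_bound) / scale for v in row])
    return out


weights = [[1.0, -0.5, 0.25], [0.0, 0.0, 0.0]]
result = fake_quant_per_channel(weights)
```

The output keeps the floating-point type of the input, which is what distinguishes fake quantization from real quantization: only the values are snapped to the 8-bit grid.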

class RealQuantLinear

Bases: QuantModule

Quantized version of nn.Linear with real quantization.
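
To contrast real quantization with fake quantization, here is a rough sketch under stated assumptions: the weight is stored as integer codes plus one scale per output channel, and the scale is applied after integer accumulation. The helpers real_quantize and linear_from_codes are hypothetical names used only for illustration; the class's actual storage format, scale tensors, and GEMM kernels are not described in this page and may differ.

```python
def real_quantize(weight, num_bits=8):
    """Return (integer codes, per-row scales) for a 2-D weight (list of lists).

    Hypothetical helper for illustration, not the library's API.
    """
    max_bound = 2 ** (num_bits - 1) - 1
    codes, scales = [], []
    for row in weight:
        amax = max(abs(v) for v in row) or 1.0               # avoid a zero step size
        scale = amax / max_bound                             # dequantization step size
        codes.append([max(-max_bound - 1, min(max_bound, round(v / scale))) for v in row])
        scales.append(scale)
    return codes, scales


def linear_from_codes(x, codes, scales, bias=None):
    """Compute y[j] = scales[j] * sum_i codes[j][i] * x[i] (+ bias[j]) for one input vector."""
    y = []
    for j, (row, s) in enumerate(zip(codes, scales)):
        acc = sum(c * xi for c, xi in zip(row, x))           # accumulate on integer codes
        y.append(s * acc + (bias[j] if bias is not None else 0.0))
    return y


codes, scales = real_quantize([[1.0, -1.0], [0.5, 0.25]])
y = linear_from_codes([2.0, 1.0], codes, scales)
```

Only the integer codes and the scales need to be kept in memory, which is where the storage saving of real quantization comes from.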

allow_real_quant_gemm = True

forward(input, *args, **kwargs)

RealQuant layer forward function.

get_real_quant_gemm_impl(input, *args, **kwargs)

Get the real quant GEMM implementation based on the input arguments.

Return type:

bool

list_of_scale_tensors = ['_scale', 'double_scale', '_scale_zeros']

class SVDQuantLinear

Bases: QuantLinearConvBase

Base class for quantized linear modules with SVDQuant.

fold_weight()

Fold the weight for faster eval.

forward(input, *args, **kwargs)

SVDQuant layer forward function.
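
As a rough illustration of the SVDQuant idea, the sketch below assumes the weight is represented as a quantized residual plus a high-precision low-rank branch, W ≈ Q(R) + B·A. It shows that evaluating the two branches separately gives the same result as first merging them into a single matrix, which is one plausible reading of what folding for faster evaluation means. All names here (matmul, transpose, forward_two_branch, fold) are hypothetical, and whether fold_weight() merges the low-rank branch in exactly this way is an assumption, not something this page states.

```python
def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_two_branch(x, residual_q, lora_a, lora_b):
    """y = x @ residual_q^T + (x @ lora_a^T) @ lora_b^T, evaluated as two branches."""
    main = matmul(x, transpose(residual_q))
    low_rank = matmul(matmul(x, transpose(lora_a)), transpose(lora_b))
    return [[m + l for m, l in zip(mr, lr)] for mr, lr in zip(main, low_rank)]


def fold(residual_q, lora_a, lora_b):
    """Merge both branches into one [out, in] matrix: residual_q + lora_b @ lora_a."""
    low_rank = matmul(lora_b, lora_a)
    return [[r + l for r, l in zip(rr, lr)] for rr, lr in zip(residual_q, low_rank)]


x = [[1.0, 2.0]]                                             # batch of 1, in_features = 2
residual_q = [[0.5, 0.0], [0.0, -0.5], [0.25, 0.25]]         # already quantized residual
lora_a = [[1.0, 1.0]]                                        # rank 1, shape [r, in]
lora_b = [[2.0], [0.0], [-1.0]]                              # shape [out, r]

y_branches = forward_two_branch(x, residual_q, lora_a, lora_b)
y_folded = matmul(x, transpose(fold(residual_q, lora_a, lora_b)))
```

After folding, each forward pass needs a single matrix product instead of three, at the cost of no longer keeping the branches separate.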