base_qtensor

Base class for real quantized tensors.

Classes

BaseQuantizedTensor

Base class for quantized tensors, providing methods for quantization and dequantization.

QTensorWrapper

A wrapper class for quantized tensors to make them compatible with torch.nn.Parameter.

class BaseQuantizedTensor

Bases: object

Base class for quantized tensors, providing methods for quantization and dequantization.

This class should be subclassed to implement specific types of quantized tensors. It handles the storage of quantized data along with the necessary configurations and original attributes.

original_meta_tensor

Meta tensor that keeps the attributes of the original tensor.

Type:

torch.Tensor

quantized_data

Storage for the quantized tensor data. The dtype of quantized_data is defined by each QuantizedTensor implementation.

Type:

torch.Tensor

__init__(original_meta_tensor, quantized_data)

Initialize data attributes.

Parameters:
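  • original_meta_tensor (Tensor) – Meta tensor that keeps the attributes of the original tensor.

  • quantized_data (Tensor) – Storage for the quantized tensor data.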
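The snippet below is a minimal sketch of what a meta tensor for original_meta_tensor could look like. It assumes the attributes of interest are the shape and dtype and that the meta tensor is built with torch.empty on the "meta" device; the library may construct it differently. The variable names are illustrative only.

```python
import torch

# A regular tensor standing in for an unquantized weight.
weight = torch.randn(4, 8)

# A meta tensor carries shape and dtype but allocates no storage.
# Building it this way is an assumption, not the library's confirmed method.
original_meta_tensor = torch.empty(weight.shape, dtype=weight.dtype, device="meta")

print(original_meta_tensor.shape, original_meta_tensor.dtype, original_meta_tensor.is_meta)
```

Because the meta tensor holds no data, it can record the original attributes even after the full-precision weight has been discarded.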

dequantize(dtype, **kwarg)

Converts the quantized tensor back to a standard torch.Tensor.

Returns:

The dequantized tensor.

Return type:

torch.Tensor

Parameters:

dtype (torch.dtype) – Data type of the returned tensor.

classmethod quantize(input, block_size)

Pack a fake quantized torch.Tensor into a real quantized tensor.

Parameters:
  • input (torch.Tensor) – The fake quantized tensor.

  • block_size (int) –

Returns:

A real quantized tensor and its scales.
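To illustrate how a subclass might fill in quantize and dequantize, here is a self-contained sketch. BaseQuantizedTensorSketch is a stand-in that only mirrors the documented attributes, and Int8BlockQTensor is a hypothetical subclass using symmetric int8 quantization with one scale per block. The quantization scheme, the reading of block_size as the number of elements per block, and passing the scales through **kwarg are all assumptions made for the example, not the library's implementation.

```python
import torch


class BaseQuantizedTensorSketch:
    """Stand-in mirroring the documented attributes; not the library class."""

    def __init__(self, original_meta_tensor, quantized_data):
        self.original_meta_tensor = original_meta_tensor
        self.quantized_data = quantized_data


class Int8BlockQTensor(BaseQuantizedTensorSketch):
    """Hypothetical subclass: symmetric int8 with one scale per block."""

    @classmethod
    def quantize(cls, input, block_size):
        # Assumes input.numel() is divisible by block_size.
        blocks = input.reshape(-1, block_size).float()
        # One scale per block so the largest magnitude maps to 127.
        scales = blocks.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 127.0
        packed = torch.round(blocks / scales).clamp(-127, 127).to(torch.int8)
        # Keep shape and dtype of the input without keeping its data.
        meta = torch.empty(input.shape, dtype=input.dtype, device="meta")
        return cls(meta, packed), scales

    def dequantize(self, dtype, **kwarg):
        # In this sketch the scales are supplied by the caller via **kwarg.
        scales = kwarg["scales"]
        restored = self.quantized_data.float() * scales
        return restored.reshape(self.original_meta_tensor.shape).to(dtype)


weight = torch.linspace(-1.0, 1.0, 32).reshape(4, 8)
qtensor, scales = Int8BlockQTensor.quantize(weight, block_size=8)
restored = qtensor.dequantize(torch.float32, scales=scales)
max_error = (restored - weight).abs().max().item()
```

In this scheme the round-trip error per element is bounded by half of the block's scale, so max_error stays small for inputs in [-1, 1].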

class QTensorWrapper

Bases: Parameter

A wrapper class for quantized tensors to make them compatible with torch.nn.Parameter.

Parameters:

qtensor (BaseQuantizedTensor) – The quantized tensor to be wrapped.

static __new__(cls, qtensor)

Create a new QTensorWrapper instance.

Parameters:

qtensor (BaseQuantizedTensor) – The quantized tensor to be wrapped.
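The following sketch shows one way a Parameter subclass can wrap a quantized tensor through __new__ so that it can be assigned as a module weight. QuantizedStub and QTensorWrapperSketch are hypothetical stand-ins, and the qtensor attribute stored on the instance is an illustrative name; the actual QTensorWrapper may keep different state.

```python
import torch


class QuantizedStub:
    """Stand-in for a BaseQuantizedTensor; holds only the documented attributes."""

    def __init__(self, original_meta_tensor, quantized_data):
        self.original_meta_tensor = original_meta_tensor
        self.quantized_data = quantized_data


class QTensorWrapperSketch(torch.nn.Parameter):
    """Hypothetical wrapper; not the library's QTensorWrapper."""

    def __new__(cls, qtensor):
        # Integer tensors cannot require gradients, so requires_grad is False.
        instance = super().__new__(cls, qtensor.quantized_data, requires_grad=False)
        # Keep a reference to the wrapped object (attribute name is illustrative).
        instance.qtensor = qtensor
        return instance


packed = torch.zeros(4, 8, dtype=torch.int8)
meta = torch.empty(4, 8, dtype=torch.float16, device="meta")
wrapper = QTensorWrapperSketch(QuantizedStub(meta, packed))

# Because the wrapper is a Parameter, a module registers it like any weight.
layer = torch.nn.Linear(8, 4, bias=False)
layer.weight = wrapper
```

After the assignment, the layer stores int8 data as its weight, so a forward pass would need to dequantize it first.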