nf4_tensor
Implements NF4 quantization for efficient tensor storage and computation.
Classes
NF4QTensor – Implements NF4 quantization on tensors for more efficient storage or computation.
- class NF4QTensor
Bases:
BaseQuantizedTensor
Implements the NF4 quantization on tensors for more efficient storage or computation.
- quantized_data
The quantized data stored as a packed uint8 tensor.
- Type:
torch.Tensor
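Since each NF4 code is only 4 bits, two codes are stored per uint8 byte. A minimal sketch of such a packing scheme (plain Python; the helper names are hypothetical and the library's actual nibble layout may differ):

```python
def pack_nf4(codes):
    """Pack pairs of 4-bit codes (0..15) into single bytes.

    Illustrative only: here the first code of each pair goes in the
    high nibble; the library's internal layout may differ.
    """
    assert len(codes) % 2 == 0, "pad to an even length before packing"
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))

def unpack_nf4(packed):
    """Recover the 4-bit codes from packed bytes."""
    out = []
    for b in packed:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out

codes = [0, 15, 7, 8]
packed = pack_nf4(codes)            # 4 codes fit in 2 bytes
assert unpack_nf4(packed) == codes  # round-trip is lossless
```

This halves storage relative to one byte per code, which is why the packed tensor holds twice as many elements as bytes.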
- dequantize(dtype=None, **kwargs)
Dequantize the NF4 packed tensor to a target dtype.
- Parameters:
dtype (torch.dtype) – The target dtype for the dequantized tensor.
- classmethod double_quantization(scales, scale_block_size, num_scale_bits)
Perform double quantization on the scales.
Unlike the quantize method, which quantizes input data, this function quantizes the float scales into int8 to further reduce the memory usage of the scales.
- Parameters:
scales (Tensor) – The per-block quantization scales to be quantized.
scale_block_size (int) – The block size used when quantizing the scales.
num_scale_bits (int) – The number of bits used to represent each quantized scale.
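Conceptually, double quantization re-quantizes the per-block float scales themselves, leaving only one float per scale block. A minimal sketch (plain Python, assuming symmetric absmax quantization of the scales; this approximates the idea, not the library's exact scheme):

```python
def double_quantize_scales(scales, scale_block_size, num_scale_bits=8):
    """Quantize float scales to signed ints to shrink their memory footprint.

    Illustrative only: symmetric absmax quantization per scale block,
    assumed here to mirror what double_quantization does.
    """
    qmax = 2 ** (num_scale_bits - 1) - 1  # e.g. 127 for 8-bit scales
    q_scales, meta_scales = [], []
    for i in range(0, len(scales), scale_block_size):
        block = scales[i:i + scale_block_size]
        absmax = max(abs(s) for s in block) or 1.0
        meta = absmax / qmax                       # one float kept per scale block
        q_scales.extend(round(s / meta) for s in block)
        meta_scales.append(meta)
    return q_scales, meta_scales

scales = [0.02, 0.5, 1.27, 0.8]
q, meta = double_quantize_scales(scales, scale_block_size=4)
approx = [qi * meta[0] for qi in q]  # reconstructed scales, close to the originals
```

The savings come from storing many int8 scales plus a handful of float meta-scales, instead of one float per block.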
- classmethod quantize(input, block_size, scale_block_size)
Convert a tensor to a quantized format based on NF4 double quantization.
- Parameters:
input (torch.Tensor) – The input tensor to be quantized.
block_size (int) – The size of each block for quantization.
scale_block_size (int) – The block size for scaling during quantization.
- Returns:
Contains quantized data, input quantization config, and scale quantization config.
- Return type:
tuple
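End to end, NF4 quantization normalizes each block by its absmax scale and maps every value to the nearest of 16 fixed normal-float levels. A pure-Python sketch of this round trip (the codebook values are the approximate NF4 levels from the QLoRA paper; the function names are illustrative, not the library API):

```python
# Approximate NF4 codebook: 16 normal-float levels from the QLoRA paper.
NF4_LEVELS = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]

def nf4_quantize(values, block_size):
    """Block-wise NF4 sketch: absmax scale per block, nearest-level lookup."""
    codes, scales = [], []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        scale = max(abs(v) for v in block) or 1.0
        scales.append(scale)
        for v in block:
            normed = v / scale  # normalized into [-1, 1]
            codes.append(min(range(16), key=lambda k: abs(NF4_LEVELS[k] - normed)))
    return codes, scales

def nf4_dequantize(codes, scales, block_size):
    """Invert the mapping: codebook lookup times the block's scale."""
    return [NF4_LEVELS[c] * scales[i // block_size] for i, c in enumerate(codes)]

vals = [0.1, -0.4, 0.25, 0.9, -0.05, 0.3, -0.7, 0.0]
codes, scales = nf4_quantize(vals, block_size=4)
recon = nf4_dequantize(codes, scales, block_size=4)
```

In the real class, `codes` would then be packed two-per-byte into the uint8 `quantized_data` tensor and `scales` further compressed via double quantization.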