nvfp4_tensor
Implements NVFP4 quantization for efficient tensor storage and computation.
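As background for the 4-bit element type behind the NVFP4 name, the sketch below decodes FP4 E2M1 codes (1 sign bit, 2 exponent bits, 1 mantissa bit) into floats. The helper `decode_e2m1` is hypothetical and written only for illustration. It is not part of this module, and this page does not confirm how the module stores or packs its codes.

```python
def decode_e2m1(code: int) -> float:
    """Decode a 4-bit E2M1 code (1 sign, 2 exponent, 1 mantissa bit) to a float.

    Hypothetical helper for illustration; not an API of this module.
    """
    if not 0 <= code <= 0xF:
        raise ValueError("E2M1 codes are 4 bits wide")
    sign = -1.0 if code & 0b1000 else 1.0
    exponent = (code >> 1) & 0b11
    mantissa = code & 0b1
    if exponent == 0:
        # Subnormal range: no implicit leading one, so the values are 0 and 0.5.
        return sign * 0.5 * mantissa
    # Normal range: implicit leading one and an exponent bias of 1.
    return sign * (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)


# The eight non-negative magnitudes an E2M1 element can represent.
POSITIVE_VALUES = [decode_e2m1(code) for code in range(8)]
print(POSITIVE_VALUES)
```

The largest magnitude is 6.0, which is why FP4 schemes pair the elements with scaling factors that map each tensor's value range onto this small grid.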
Classes
NVFP4QTensor | Not implemented.
- class NVFP4QTensor
Bases: BaseQuantizedTensor
Not implemented.
- dequantize(dtype=torch.float16, **kwarg)
Not implemented.
- Parameters:
dtype (dtype) –
- classmethod get_activation_scaling_factor(quantizer)
Not implemented.
- classmethod get_weights_scaling_factor(input, block_size, weights_scaling_factor_2=None, keep_high_precision=False)
Not implemented.
- Parameters:
input (Tensor) –
block_size (int) –
weights_scaling_factor_2 (Tensor | None) –
keep_high_precision (bool) –
- classmethod get_weights_scaling_factor_2(input)
Not implemented.
- Parameters:
input (Tensor) –
- classmethod quantize(input, block_size, weights_scaling_factor=None, weights_scaling_factor_2=None, keep_high_precision=False)
Not implemented.
- Parameters:
input (Tensor) –
block_size (int) –
weights_scaling_factor (Tensor | None) –
weights_scaling_factor_2 (Tensor | None) –
keep_high_precision (bool) –
- classmethod resmooth_weights_and_get_scales(merged_weights, pre_quant_scales, ranks, group_size, avg_pre_quant_scale=None)
Not implemented.
- Parameters:
merged_weights (Tensor) –
pre_quant_scales (List[Tensor]) –
ranks (int) –
group_size (int) –
avg_pre_quant_scale (Tensor | None) –
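The signatures above suggest a two-level scaling scheme: a per-block factor (`weights_scaling_factor`) and a second, tensor-wide factor (`weights_scaling_factor_2`). The following is a minimal pure-Python sketch of that idea, not this class's implementation. It rests on several assumptions: the global scale is amax / (6 × 448), where 6 is the largest E2M1 magnitude and 448 the largest FP8 E4M3 value; the per-block scales stay as Python floats instead of being rounded to FP8; and quantized elements are kept as signed grid values rather than packed 4-bit codes. All function names are hypothetical.

```python
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes
E2M1_MAX = 6.0
E4M3_MAX = 448.0  # largest finite FP8 E4M3 value


def global_scale(values):
    """Tensor-wide scale (assumed convention: amax / (6 * 448))."""
    amax = max(abs(v) for v in values)
    return amax / (E2M1_MAX * E4M3_MAX) if amax > 0 else 1.0


def block_scales(values, block_size, scale_2):
    """One scale per block, expressed relative to the global scale."""
    scales = []
    for start in range(0, len(values), block_size):
        block_amax = max(abs(v) for v in values[start:start + block_size])
        # An all-zero block gets a neutral scale to avoid dividing by zero.
        scales.append(block_amax / (E2M1_MAX * scale_2) if block_amax > 0 else 1.0)
    return scales


def quantize(values, block_size):
    """Snap each element to the nearest E2M1 magnitude after scaling."""
    if len(values) % block_size != 0:
        raise ValueError("len(values) must be a multiple of block_size")
    scale_2 = global_scale(values)
    scales = block_scales(values, block_size, scale_2)
    quantized = []
    for i, v in enumerate(values):
        step = scales[i // block_size] * scale_2
        magnitude = min(E2M1_GRID, key=lambda g: abs(abs(v) / step - g))
        quantized.append(-magnitude if v < 0 else magnitude)
    return quantized, scales, scale_2


def dequantize(quantized, scales, scale_2, block_size):
    """Invert the scaling: element * block scale * global scale."""
    return [q * scales[i // block_size] * scale_2 for i, q in enumerate(quantized)]


values = [0.1, -0.2, 0.3, -0.6, 1.0, 2.0, -3.0, 6.0]
quantized, scales, scale_2 = quantize(values, block_size=4)
restored = dequantize(quantized, scales, scale_2, block_size=4)
print(quantized)
print(restored)
```

The example input is chosen so that every element lands exactly on the grid after scaling, so the round trip recovers the input up to floating-point error. Arbitrary inputs incur rounding error bounded by the grid spacing within each block.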