nvfp4_tensor

Implements NVFP4 quantization for efficient tensor storage and computation.

Classes

NVFP4QTensor

Not implemented.

class NVFP4QTensor

Bases: BaseQuantizedTensor

Not implemented.

dequantize(dtype=torch.float16, **kwarg)

Not implemented.

Parameters:

dtype (dtype) –
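
The entry above does not describe the arithmetic, so the following is only a minimal sketch of what dequantizing an NVFP4-style block usually involves. It is written in plain Python so it runs without torch. It assumes the common NVFP4 layout: 4-bit E2M1 codes (sign bit plus a 3-bit index into the magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6), packed two per byte, with a per-block scale and a per-tensor scale. The helper name `dequantize_block` and the nibble order (low nibble first) are assumptions for illustration and are not taken from the library.

```python
# Magnitudes representable by a 4-bit E2M1 value, indexed by the low 3 bits.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def dequantize_block(packed, block_scale, scale_2):
    """Hypothetical helper: unpack E2M1 codes and apply both scales.

    packed      -- bytes, two 4-bit codes per byte (low nibble first; assumed)
    block_scale -- per-block scaling factor
    scale_2     -- per-tensor (second-level) scaling factor
    """
    values = []
    for byte in packed:
        for code in (byte & 0x0F, byte >> 4):
            magnitude = E2M1_GRID[code & 0x7]      # low 3 bits select the magnitude
            sign = -1.0 if code & 0x8 else 1.0     # bit 3 is the sign
            values.append(sign * magnitude * block_scale * scale_2)
    return values


# Four codes packed into two bytes, scaled by 4.0 * 0.125 = 0.5.
restored = dequantize_block(bytes([0xF2, 0x14]), block_scale=4.0, scale_2=0.125)
```

A real implementation would operate on whole tensors and cast the result to the requested `dtype`; this sketch only shows the per-element lookup and scaling.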

classmethod get_activation_scaling_factor(quantizer)

Not implemented.

classmethod get_weights_scaling_factor(input, block_size, weights_scaling_factor_2=None, keep_high_precision=False)

Not implemented.

Parameters:
  • input (Tensor) –

  • block_size (int) –

  • weights_scaling_factor_2 (Tensor | None) –

  • keep_high_precision (bool) –
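
As an illustration of how a per-block weight scale is typically derived in an NVFP4-style scheme, here is a minimal sketch in plain Python. It assumes each block's scale is its absolute maximum divided by 6 (the largest E2M1 magnitude), then divided by the per-tensor factor so it can be stored in a narrow format. The helper name `block_scales` is hypothetical. The scales are kept as ordinary floats, which may or may not correspond to what `keep_high_precision=True` means in the library; that reading is a guess from the parameter name.

```python
E2M1_MAX = 6.0  # largest magnitude representable in FP4 E2M1


def block_scales(row, block_size, scale_2):
    """Hypothetical helper: one scale per contiguous block of `block_size` values."""
    if block_size <= 0 or len(row) % block_size != 0:
        raise ValueError("row length must be a positive multiple of block_size")
    scales = []
    for start in range(0, len(row), block_size):
        block = row[start:start + block_size]
        amax = max(abs(v) for v in block)
        if amax == 0.0:
            scales.append(1.0)  # all-zero block: any scale works, avoid dividing by zero
        else:
            scales.append(amax / E2M1_MAX / scale_2)
    return scales


row = [0.5, -2.0, 1.0, 0.25, 3.0, -0.75, 0.0, 1.5]
scale_2 = 3.0 / (6.0 * 448.0)  # per-tensor factor, assumed amax / (6 * 448)
scales = block_scales(row, block_size=4, scale_2=scale_2)
```

With this choice of `scale_2`, the block containing the tensor-wide maximum ends up with a scale of about 448, the largest finite FP8 E4M3 value, which is the motivation for the two-level scheme.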

classmethod get_weights_scaling_factor_2(input)

Not implemented.

Parameters:

input (Tensor) –
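
For orientation, NVFP4 is commonly described as a two-level scheme: 4-bit E2M1 elements (maximum magnitude 6), a per-block scale stored in FP8 E4M3 (maximum finite value 448), and one per-tensor scale. Under that assumption, a per-tensor factor can be chosen so that the largest per-block scale still fits in E4M3. The sketch below shows that calculation in plain Python; `global_scale` is a hypothetical name, and nothing here is confirmed to be the library's formula.

```python
E2M1_MAX = 6.0    # largest magnitude representable in FP4 E2M1
E4M3_MAX = 448.0  # largest finite magnitude in FP8 E4M3


def global_scale(weights):
    """Hypothetical helper: per-tensor scale = amax / (6 * 448)."""
    amax = max(abs(w) for row in weights for w in row)
    return amax / (E2M1_MAX * E4M3_MAX)


weights = [[0.5, -2.0, 1.0, 0.25], [3.0, -0.75, 0.0, 1.5]]
scale_2 = global_scale(weights)
```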

classmethod quantize(input, block_size, weights_scaling_factor=None, weights_scaling_factor_2=None, keep_high_precision=False)

Not implemented.

Parameters:
  • input (Tensor) –

  • block_size (int) –

  • weights_scaling_factor (Tensor | None) –

  • weights_scaling_factor_2 (Tensor | None) –

  • keep_high_precision (bool) –
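
To make the quantization step concrete, the following sketch rounds the values of one block to the nearest E2M1 magnitude and packs two 4-bit codes per byte. It is a simplified illustration in plain Python, not the library's code. The helper names, the nibble order (even index in the low nibble), and the tie-breaking rule (ties go to the smaller magnitude) are assumptions. Here `scale` stands for the combined per-block and per-tensor scale.

```python
# Magnitudes representable by a 4-bit E2M1 value, indexed by the low 3 bits.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def to_e2m1_code(x):
    """Return a 4-bit code: sign in bit 3, nearest grid index in bits 0-2."""
    magnitude = min(abs(x), E2M1_GRID[-1])  # clamp to the representable range
    index = min(range(len(E2M1_GRID)), key=lambda i: abs(E2M1_GRID[i] - magnitude))
    return (0x8 if x < 0 else 0x0) | index


def quantize_block(block, scale):
    """Hypothetical helper: scale, round to E2M1, and pack two codes per byte."""
    if len(block) % 2 != 0:
        raise ValueError("block length must be even to pack two codes per byte")
    codes = [to_e2m1_code(v / scale) for v in block]
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


packed = quantize_block([0.5, -3.0, 1.1, 0.2], scale=0.5)
```

In this example the scaled values are 1.0, -6.0, 2.2 and 0.4, which round to 1.0, -6.0, 2.0 and 0.5, so the rounding error of the format is visible in the last two elements.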

classmethod resmooth_weights_and_get_scales(merged_weights, pre_quant_scales, ranks, group_size, avg_pre_quant_scale=None)

Not implemented.

Parameters:
  • merged_weights (Tensor) –

  • pre_quant_scales (List[Tensor]) –

  • ranks (int) –

  • group_size (int) –

  • avg_pre_quant_scale (Tensor | None) –
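
The signature suggests that weights merged from several ranks, each smoothed with its own pre-quantization scale, are rescaled to a shared scale before new scaling factors are computed. The sketch below shows one plausible reading of that step in plain Python. It assumes the merged weights are split evenly by rows across ranks, and that a weight is de-smoothed by multiplying with its rank's scale and re-smoothed by dividing by the average. The function name `resmooth` and this convention are assumptions; the subsequent per-block scale computation (which would use `group_size`) is omitted.

```python
def resmooth(merged_weights, pre_quant_scales, ranks, avg_pre_quant_scale=None):
    """Hypothetical helper: bring every rank's weights onto one shared scale.

    merged_weights   -- list of rows, rank chunks stacked along the row axis
    pre_quant_scales -- one per-column scale list per rank
    """
    n_cols = len(merged_weights[0])
    if avg_pre_quant_scale is None:
        # Default to the element-wise mean of the per-rank scales.
        avg_pre_quant_scale = [
            sum(s[c] for s in pre_quant_scales) / len(pre_quant_scales)
            for c in range(n_cols)
        ]
    rows_per_rank = len(merged_weights) // ranks
    resmoothed = []
    for rank, scale in enumerate(pre_quant_scales):
        chunk = merged_weights[rank * rows_per_rank:(rank + 1) * rows_per_rank]
        for row in chunk:
            resmoothed.append(
                [w * scale[c] / avg_pre_quant_scale[c] for c, w in enumerate(row)]
            )
    return resmoothed, avg_pre_quant_scale


weights, avg = resmooth(
    merged_weights=[[1.0, 2.0], [3.0, 4.0]],
    pre_quant_scales=[[1.0, 2.0], [3.0, 2.0]],
    ranks=2,
)
```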