nvfp4_quant
Composable Triton JIT functions for NVFP4 (E2M1) fake quantization.
- Single source of truth for FP4 decision-boundary rounding. Used by:
  - fp4_kernel.py (standalone blockwise fake quant)
  - fp4_kernel_hopper.py (Hopper block-pointer variant)
  - gptq_fused_kernel.py (fused GPTQ scalar path)
FP4 (E2M1) representable magnitudes: {0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0}
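
As a rough illustration of what decision-boundary rounding means for the magnitudes above, here is a minimal pure-Python sketch. It is not the module's Triton code: the names `round_to_fp4` and `_BOUNDARIES` are hypothetical, the boundaries are simply the midpoints between adjacent representable magnitudes, and the tie-breaking rule (ties go to the value with an even mantissa bit) is an assumption that the real kernels may or may not share. Block scaling is left out entirely.

```python
import math

# Representable E2M1 magnitudes, as listed in the docstring above.
FP4_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)

# Hypothetical boundary table: (midpoint between neighbours, tie rounds down?).
# The tie direction assumes round-to-nearest-even on the mantissa bit.
_BOUNDARIES = (
    (0.25, True),   # tie -> 0.0
    (0.75, False),  # tie -> 1.0
    (1.25, True),   # tie -> 1.0
    (1.75, False),  # tie -> 2.0
    (2.5, True),    # tie -> 2.0
    (3.5, False),   # tie -> 4.0
    (5.0, True),    # tie -> 4.0
)


def round_to_fp4(x: float) -> float:
    """Round x to the nearest E2M1 value, saturating at +/-6.0.

    NaN inputs are not handled specially in this sketch.
    """
    mag = abs(x)
    result = FP4_MAGNITUDES[-1]  # anything above the last boundary saturates
    for i, (boundary, tie_down) in enumerate(_BOUNDARIES):
        if mag < boundary or (tie_down and mag == boundary):
            result = FP4_MAGNITUDES[i]
            break
    # Restore the sign of the input.
    return math.copysign(result, x)
```

A Triton version would typically express the same comparisons as a chain of elementwise selects rather than a Python loop, but the boundary table is the part that needs to stay consistent across kernels.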