gemm

Modules

`modelopt.torch.kernels.quantization.gemm.fp4_kernel`	NVFP4 Fake Quantization Triton Implementation.
`modelopt.torch.kernels.quantization.gemm.fp4_kernel_hopper`	NVFP4 Fake Quantization Triton kernels requiring compute capability >= 8.9 (Ada Lovelace, Hopper, and newer).
`modelopt.torch.kernels.quantization.gemm.fp8_kernel`	FP8 Triton Kernel Implementations.
`modelopt.torch.kernels.quantization.gemm.gptq_fused_kernel`	Fused Triton kernels for GPTQ blockwise weight updates.
`modelopt.torch.kernels.quantization.gemm.nvfp4_quant`	Composable Triton JIT functions for NVFP4 (E2M1) fake quantization.

Triton quantization kernels.
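For readers unfamiliar with the term, the following is a minimal pure-Python sketch of what "fake quantization" to E2M1 means: each value is snapped to the nearest representable FP4 magnitude and immediately converted back to a regular float. It is not the implementation in these modules. The function name `fake_quantize_block` is hypothetical, and the sketch keeps the per-block scale in full precision, whereas NVFP4 stores block scales in FP8 and also applies a global scale.

```python
import math

# Magnitudes representable in E2M1 (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E2M1_MAX = 6.0


def fake_quantize_block(block):
    """Quantize one block of floats to E2M1 and dequantize it again.

    Hypothetical helper for illustration only. Assumption: the block scale
    is kept as a plain Python float instead of being quantized to FP8.
    """
    amax = max((abs(x) for x in block), default=0.0)
    if amax == 0.0:
        return [0.0 for _ in block]

    # Map the largest magnitude in the block onto the largest E2M1 value.
    scale = amax / E2M1_MAX

    result = []
    for x in block:
        scaled = abs(x) / scale
        # Snap to the nearest representable magnitude (ties go to the smaller one).
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - scaled))
        # Restore the sign and undo the scaling (the "fake" part of fake quantization).
        result.append(math.copysign(1.0, x) * nearest * scale)
    return result
```

Calling `fake_quantize_block` on a block of values returns floats of the same length that lie on the scaled E2M1 grid, which is how quantization error can be simulated without storing real 4-bit data.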