gemm
Modules
NVFP4 Fake Quantization Triton Implementation.
NVFP4 Fake Quantization Triton kernels requiring compute capability >= 8.9 (Ada Lovelace and newer, including Hopper).
FP8 Triton Kernel Implementations.
Fused Triton kernels for GPTQ blockwise weight-update.
Composable Triton JIT functions for NVFP4 (E2M1) fake quantization.
Triton quantization kernels.
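
For orientation, NVFP4 (E2M1) fake quantization can be sketched in plain Python: each block of values is scaled so its largest magnitude maps to 6.0 (the largest E2M1 value), snapped to the nearest representable E2M1 magnitude, and scaled back. This is a minimal illustration, not the Triton kernels listed above. The names `fake_quant_e2m1_block` and `fake_quant_e2m1` are hypothetical, the block size of 16 follows the NVFP4 format, and the sketch keeps the per-block scale in full precision rather than rounding it to FP8 E4M3 as NVFP4 does.

```python
import math

# Magnitudes representable in E2M1 (1 sign, 2 exponent, 1 mantissa bit).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E2M1_MAX = 6.0


def fake_quant_e2m1_block(block):
    """Quantize one block to E2M1 and dequantize it again (hypothetical helper)."""
    amax = max((abs(v) for v in block), default=0.0)
    if amax == 0.0:
        return [0.0 for _ in block]
    # Assumption: the scale stays in full precision; NVFP4 stores it as FP8 E4M3.
    scale = amax / E2M1_MAX
    out = []
    for v in block:
        mag = abs(v) / scale
        # Nearest grid point; exact ties go to the smaller magnitude here,
        # which may differ from the rounding mode used by real kernels.
        q = min(E2M1_GRID, key=lambda g: abs(g - mag))
        out.append(math.copysign(q * scale, v))
    return out


def fake_quant_e2m1(values, block_size=16):
    """Apply blockwise E2M1 fake quantization to a flat list of floats."""
    out = []
    for start in range(0, len(values), block_size):
        out.extend(fake_quant_e2m1_block(values[start:start + block_size]))
    return out
```

The output has the same length and floating-point type as the input, which is what makes the quantization "fake": only the values are restricted to what the 4-bit format could represent.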