quantization
Modules
Quantization-specific attention kernel pieces (placeholder for combined sparse+quant path).
Implicit-GEMM CUDA kernel for quantized 3D convolution.
Triton quantization kernels.
Quantization kernels: conv (implicit GEMM) and gemm (tensor_quant + Triton FP4/FP8).
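
As a rough illustration of what "implicit GEMM" means for a 3D convolution, the sketch below treats the convolution as a matrix multiply whose operand matrix is never materialized: each GEMM row index is decoded into an output position and each reduction index into a (channel, filter offset) triple on the fly. This is not the module's CUDA kernel. The function name `conv3d_implicit_gemm`, the flat memory layouts, and the restriction to stride 1 with no padding are assumptions made for the example, and plain Python integers stand in for quantized values accumulated in a wider integer type.

```python
def conv3d_implicit_gemm(x, w, in_shape, k_shape):
    """Hypothetical reference: 3D convolution expressed as an implicit GEMM.

    x: flat list of ints, layout (C, D, H, W)
    w: flat list of ints, layout (K, C, kd, kh, kw)
    in_shape = (C, D, H, W); k_shape = (K, kd, kh, kw)
    Assumes stride 1 and no padding. Returns a flat list, layout (K, Do, Ho, Wo).
    """
    C, D, H, W = in_shape
    K, kd, kh, kw = k_shape
    Do, Ho, Wo = D - kd + 1, H - kh + 1, W - kw + 1

    M = Do * Ho * Wo       # GEMM rows: one per output position
    N = K                  # GEMM columns: one per output channel
    R = C * kd * kh * kw   # GEMM reduction dimension

    out = [0] * (N * M)
    for m in range(M):
        # Decode the row index into an output coordinate (od, oh, ow).
        od, rem = divmod(m, Ho * Wo)
        oh, ow = divmod(rem, Wo)
        for n in range(N):
            acc = 0  # stands in for a wide integer accumulator
            for r in range(R):
                # Decode the reduction index into (c, fd, fh, fw).
                c, rem = divmod(r, kd * kh * kw)
                fd, rem = divmod(rem, kh * kw)
                fh, fw = divmod(rem, kw)
                # Gather the input element directly; no im2col buffer is built.
                a = x[((c * D + od + fd) * H + oh + fh) * W + ow + fw]
                b = w[n * R + r]
                acc += a * b
            out[n * M + m] = acc
    return out
```

A GPU kernel would tile the M, N, and R loops across thread blocks and perform the same index decoding while loading tiles, but the mapping between GEMM indices and convolution coordinates is the part this sketch is meant to show.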