quantization

Modules

modelopt.torch.kernels.quantization.attention

Quantization-specific attention kernel components (placeholder for a combined sparse + quantized attention path).

modelopt.torch.kernels.quantization.conv

Implicit-GEMM CUDA kernel for quantized 3D convolution.
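To illustrate the term "implicit GEMM", here is a minimal pure-Python sketch, not the CUDA kernel itself. It assumes stride 1, no padding, and no quantization. The function name `conv3d_implicit_gemm` and the tensor layouts are hypothetical, chosen only for this illustration. The convolution is treated as a matrix product, but the im2col matrix is never materialized: each GEMM index is decoded into tensor coordinates on the fly.

```python
def conv3d_implicit_gemm(x, w):
    """Illustrative 3D convolution expressed as an implicit GEMM.

    x: nested lists shaped [C][D][H][W]        (input, hypothetical layout)
    w: nested lists shaped [K][C][T][R][S]     (filters, hypothetical layout)
    Returns out[K][Do*Ho*Wo]: rows are output channels (GEMM M dimension),
    columns are flattened output positions (GEMM N dimension).
    Assumes stride 1 and no padding.
    """
    C, D, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    K = len(w)
    T, R, S = len(w[0][0]), len(w[0][0][0]), len(w[0][0][0][0])
    Do, Ho, Wo = D - T + 1, H - R + 1, W - S + 1

    gemm_n = Do * Ho * Wo      # number of output positions
    gemm_k = C * T * R * S     # reduction length

    out = [[0.0] * gemm_n for _ in range(K)]
    for m in range(K):
        for n in range(gemm_n):
            # Decode the flat output index into (od, oh, ow).
            od, rem = divmod(n, Ho * Wo)
            oh, ow = divmod(rem, Wo)
            acc = 0.0
            for k in range(gemm_k):
                # Decode the flat reduction index into (c, t, r, s).
                c, rem = divmod(k, T * R * S)
                t, rem = divmod(rem, R * S)
                r, s = divmod(rem, S)
                # The im2col element is read directly from the input tensor.
                acc += w[m][c][t][r][s] * x[c][od + t][oh + r][ow + s]
            out[m][n] = acc
    return out


# One channel, 2x2x2 input holding 1..8, and a single all-ones 2x2x2 filter.
x = [[[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]]
w = [[[[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]]]
result = conv3d_implicit_gemm(x, w)  # one output position, the sum 1+...+8
```

A GPU kernel would tile the same index arithmetic across thread blocks instead of looping; the point of the sketch is only the index decoding that replaces an explicit im2col buffer.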

modelopt.torch.kernels.quantization.gemm

Triton quantization kernels.

Quantization kernels: conv (implicit-GEMM convolution) and gemm (tensor_quant plus Triton FP4/FP8 kernels).
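As a rough illustration of what FP4 quantization means numerically, the following pure-Python sketch quantizes and then dequantizes one block of values onto the FP4 E2M1 grid with a shared per-block scale. It is not the library's Triton kernel. The function name `fake_quant_fp4_block`, the amax-based scale, and the tie-breaking rule are assumptions made for this example; real kernels may also store the scale in a reduced-precision format, which is not modeled here.

```python
# Magnitudes representable in FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fake_quant_fp4_block(block):
    """Quantize-dequantize a block of floats with one shared scale.

    Hypothetical helper for illustration. The scale maps the block's largest
    magnitude onto the largest E2M1 value (6.0). Ties between two grid points
    resolve to the smaller magnitude in this sketch.
    """
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    scale = amax / E2M1_VALUES[-1]

    out = []
    for v in block:
        mag = abs(v) / scale
        # Snap the scaled magnitude to the nearest representable value.
        q = min(E2M1_VALUES, key=lambda c: abs(c - mag))
        out.append((q if v >= 0 else -q) * scale)
    return out


dequantized = fake_quant_fp4_block([6.0, -3.0, 1.2, 0.0])
# Here amax is 6.0, so the scale is 1.0 and 1.2 snaps to 1.0.
```

Because the grid is coarse (eight magnitudes), the choice of block size and scale has a large effect on error, which is one reason block-wise scaling is commonly paired with FP4.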