kernels
Kernel integrations for sparse attention: Triton flash attention (FA) and the diffusers/LTX backends.
Functions
| Function | Description |
|---|---|
| `get_skip_softmax_context` | Return whether skip-softmax patching of softmax is active. |
| `register_diffusers_triton_attention` | Register the `modelopt_triton` backend in diffusers. |
| `register_ltx_triton_attention` | Patch all `ltx_core.Attention` modules for Triton dispatch. |
| `set_skip_softmax_context` | Set whether skip-softmax patching of softmax is active (thread-local). |
- get_skip_softmax_context()
Return whether skip-softmax patching of softmax is active.
- Return type:
bool
- register_diffusers_triton_attention()
Register the `modelopt_triton` backend in diffusers. Safe to call multiple times; registration happens only once.
- Return type:
None
- register_ltx_triton_attention(model)
Patch all `ltx_core.Attention` modules for Triton dispatch.
- Parameters:
model (Module)
- Return type:
None
- set_skip_softmax_context(active)
Set whether skip-softmax patching of softmax is active (thread-local).
- Parameters:
active (bool)
- Return type:
None
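The "(thread-local)" note means each thread sees its own flag: enabling skip-softmax in one thread does not affect others. A plausible implementation sketch of this getter/setter pair using `threading.local()` (the real library's internals may differ):

```python
import threading

# Each thread gets its own `active` attribute on this object.
_ctx = threading.local()


def set_skip_softmax_context(active: bool) -> None:
    """Enable or disable skip-softmax patching for the calling thread."""
    _ctx.active = active


def get_skip_softmax_context() -> bool:
    """Report the calling thread's flag; threads that never set it see False."""
    return getattr(_ctx, "active", False)


set_skip_softmax_context(True)  # main thread opts in


def worker(out):
    # A fresh thread never set the flag, so it reads the default
    out.append(get_skip_softmax_context())


results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
t.join()
```

Defaulting to `False` via `getattr` avoids an `AttributeError` in threads that never called the setter.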