mode
Sparse attention mode descriptor for ModelOpt.
Classes
SparseAttentionModeDescriptor | Mode descriptor for sparse attention optimization.
- class SparseAttentionModeDescriptor
Bases: ModeDescriptor
Mode descriptor for sparse attention optimization.
This mode enables various sparse attention methods to reduce computational complexity and memory usage in transformer models.
- property config_class: type[ModeloptBaseConfig]
Specifies the config class for the mode.
- property convert: Callable[[Module, ModeloptBaseConfig], tuple[Module, dict[str, Any]]] | Callable[[Module, ModeloptBaseConfig, Any], tuple[Module, dict[str, Any]]]
The mode’s entrypoint for converting a model.
- property export_mode: str | None
The mode that corresponds to the export mode of this mode.
- property name: str
Returns the value (str representation) of the mode.
- property next_prohibited_modes: set[str] | None
Modes that should not be applied after this mode.
- property restore: Callable[[Module, ModeloptBaseConfig, dict[str, Any]], Module]
The mode’s entrypoint for restoring a model.
- property update_for_new_mode: Callable[[Module, ModeloptBaseConfig, dict[str, Any]], None]
The mode’s entrypoint for updating the model’s state before a new mode is applied.
- property update_for_save: Callable[[Module, ModeloptBaseConfig, dict[str, Any]], None]
The mode’s entrypoint for updating the model’s state before saving.
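The callable signatures listed above can be easier to read with a toy example. The sketch below is not ModelOpt code: `ToyConfig`, `ToyModel`, `toy_convert`, `toy_restore`, and `ToyModeDescriptor` are hypothetical stand-ins that only mirror the shapes of `config_class`, `convert`, and `restore` as documented here (`convert` returns a model plus a metadata dict, and `restore` takes that dict back). How the real descriptor implements these entrypoints is not stated in this page.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a ModeloptBaseConfig subclass."""

    sparsity: float = 0.5


class ToyModel:
    """Hypothetical stand-in for a torch.nn.Module."""

    def __init__(self) -> None:
        self.sparse_attention_enabled = False
        self.sparsity = 0.0


def toy_convert(model: ToyModel, config: ToyConfig) -> tuple[ToyModel, dict[str, Any]]:
    """Mirror the documented convert shape: (model, config) -> (model, metadata)."""
    model.sparse_attention_enabled = True
    model.sparsity = config.sparsity
    metadata = {"sparsity": config.sparsity}
    return model, metadata


def toy_restore(model: ToyModel, config: ToyConfig, metadata: dict[str, Any]) -> ToyModel:
    """Mirror the documented restore shape: (model, config, metadata) -> model."""
    model.sparse_attention_enabled = True
    model.sparsity = metadata["sparsity"]
    return model


class ToyModeDescriptor:
    """Hypothetical descriptor exposing its entrypoints as properties."""

    @property
    def name(self) -> str:
        return "toy_sparse_attention"

    @property
    def config_class(self) -> type:
        return ToyConfig

    @property
    def convert(self) -> Callable[[ToyModel, ToyConfig], tuple[ToyModel, dict[str, Any]]]:
        return toy_convert

    @property
    def restore(self) -> Callable[[ToyModel, ToyConfig, dict[str, Any]], ToyModel]:
        return toy_restore


# Usage: convert a fresh model, then rebuild the same state from the metadata.
descriptor = ToyModeDescriptor()
config = descriptor.config_class(sparsity=0.25)
model, metadata = descriptor.convert(ToyModel(), config)
restored = descriptor.restore(ToyModel(), config, metadata)
```

Exposing the entrypoints as properties that return callables lets the descriptor be treated as plain data: a caller looks up the callable and invokes it without knowing which mode it belongs to.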