speculative
Modules
Configurations for speculative decoding modes.
Medusa Optimization Method.
This module contains the mode descriptor for the speculative decoding mode.
Handles speculative plugins for third-party modules.
User-facing API for converting a model into a modelopt.torch.speculative.MedusaModel.
Speculative Decoding Optimizations.
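For readers new to the technique, the verification step at the core of speculative decoding can be sketched in a few lines. This is a toy, greedy illustration only and does not use or mirror the modelopt API: `accept_draft` and `toy_target` are hypothetical names, and the target model is called once per token here for clarity, whereas real implementations typically score all draft positions in a single forward pass.

```python
from typing import Callable, List, Sequence


def accept_draft(
    prefix: Sequence[int],
    draft: Sequence[int],
    target_next: Callable[[List[int]], int],
) -> List[int]:
    """Greedy verification: keep draft tokens while they match the target model."""
    accepted: List[int] = []
    context = list(prefix)
    for token in draft:
        expected = target_next(context)
        if token != expected:
            # First mismatch: emit the target's own token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)
        context.append(token)
    # Every draft token matched, so the target contributes one extra token.
    accepted.append(target_next(context))
    return accepted


def toy_target(context: List[int]) -> int:
    # Hypothetical stand-in for a base model: next token is (last + 1) mod 10.
    return (context[-1] + 1) % 10


print(accept_draft([1, 2], [3, 4, 9], toy_target))  # [3, 4, 5]
```

The output always matches what the target model would have produced on its own under greedy decoding; the draft only affects how many tokens are confirmed per verification step.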