config
Configurations for speculative decoding modes.
Classes
| DFlashConfig | DFlash config for block-wise parallel speculative decoding. |
| EagleConfig | Eagle config. |
| MedusaConfig | Medusa config. |
- class DFlashConfig
Bases: ModeloptBaseConfig
DFlash config for block-wise parallel speculative decoding.
- dflash_architecture_config: dict
- dflash_block_size: int
- dflash_dpace_alpha: float
- dflash_export_rope_scaling: dict
- dflash_freeze_base_model: bool
- dflash_loss_decay_factor: float
- dflash_loss_objective: Literal['decay', 'dpace']
- dflash_mask_token_id: int | None
- dflash_num_anchors: int
- dflash_offline: bool
- dflash_report_acc: bool
- dflash_self_logit_distillation: bool
- dflash_use_torch_compile: bool
- model_config = {'extra': 'forbid', 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
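The field types listed for DFlashConfig suggest how values are checked. The following is a minimal sketch, assuming pydantic v2 and using a hypothetical stand-in class (`DFlashLikeConfig`) rather than the real `DFlashConfig`. The defaults shown are illustrative assumptions and are not taken from the library. It only illustrates how a `Literal['decay', 'dpace']` field and an `int | None` field behave under pydantic validation.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class DFlashLikeConfig(BaseModel):
    """Hypothetical stand-in that mirrors two DFlashConfig field types."""

    model_config = ConfigDict(extra="forbid", validate_assignment=True)

    # Defaults below are assumptions for illustration only.
    dflash_loss_objective: Literal["decay", "dpace"] = "decay"
    dflash_mask_token_id: Optional[int] = None  # same meaning as `int | None`


def is_valid_objective(value: str) -> bool:
    """Return True if pydantic accepts `value` for the Literal field."""
    try:
        DFlashLikeConfig(dflash_loss_objective=value)
    except ValidationError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_objective("dpace"))  # accepted: one of the two literals
    print(is_valid_objective("mse"))    # rejected: not a listed literal
    print(DFlashLikeConfig().dflash_mask_token_id)  # optional field left unset
```

With a `Literal` annotation, any string other than the listed options fails validation at construction time, so a mistyped objective name surfaces early instead of during training.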
- class EagleConfig
Bases: ModeloptBaseConfig
Eagle config.
- eagle_architecture_config: dict
- eagle_base_lora: bool
- eagle_base_lora_alpha: float
- eagle_base_lora_logits_detach_prob: float
- eagle_base_lora_preservation_loss_weight: float
- eagle_base_lora_rank: int
- eagle_base_lora_start_layer: int | None
- eagle_base_lora_target_modules: list | None
- eagle_base_lora_warmup_steps: int
- eagle_decoder_type: str
- eagle_enable_nvtx: bool
- eagle_export_rope_scaling: dict
- eagle_freeze_base_model: bool
- eagle_loss_decay_factor: float
- eagle_offline: bool
- eagle_report_acc: bool
- eagle_reuse_base_decoder: bool
- eagle_self_logit_distillation: bool
- eagle_ttt_steps: int
- eagle_use_torch_compile: bool
- model_config = {'extra': 'forbid', 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class MedusaConfig
Bases: ModeloptBaseConfig
Medusa config.
- medusa_num_heads: int
- medusa_num_layers: int
- model_config = {'extra': 'forbid', 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
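All three classes share `model_config = {'extra': 'forbid', 'validate_assignment': True}`. The sketch below shows what those two pydantic settings mean in practice, assuming pydantic v2. `MedusaLikeConfig` is a hypothetical stand-in with the same field names as MedusaConfig. It is not the real class, and its defaults are assumptions chosen for illustration.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class MedusaLikeConfig(BaseModel):
    """Hypothetical stand-in using the same model_config as the classes above."""

    model_config = ConfigDict(extra="forbid", validate_assignment=True)

    # Defaults are illustrative assumptions, not the library's values.
    medusa_num_heads: int = 2
    medusa_num_layers: int = 1


def rejects_unknown_field() -> bool:
    """extra='forbid': a misspelled or unknown field raises ValidationError."""
    try:
        MedusaLikeConfig(medusa_num_heads=4, medusa_heads=4)  # second name is a typo
    except ValidationError:
        return True
    return False


def rejects_bad_assignment() -> bool:
    """validate_assignment=True: values are re-validated when attributes are set."""
    cfg = MedusaLikeConfig()
    try:
        cfg.medusa_num_layers = "many"  # not coercible to int
    except ValidationError:
        return True
    return False


if __name__ == "__main__":
    print(rejects_unknown_field())
    print(rejects_bad_assignment())
```

Under these settings a typo in a field name fails loudly instead of being silently ignored, and a config object cannot be mutated into an invalid state after it has been created.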