expert_removal_pruning_mixin
Classes
- ExpertRemovalLayerDescriptor: Descriptor for expert-removal pruning layers.
- class ExpertRemovalLayerDescriptor
Bases: LayerDescriptor
Descriptor for expert-removal pruning layers.
- __init__(target_name, moe_prefix_name, expert_prefix_name='', router_weights=<factory>, router_biases=<factory>, expert_weights=<factory>, expert_biases=<factory>, is_fused_experts=False, fused_expert_weights=<factory>)
- Parameters:
target_name (str)
moe_prefix_name (str)
expert_prefix_name (str)
router_weights (List[str])
router_biases (List[str])
expert_weights (List[str])
expert_biases (List[str])
is_fused_experts (bool)
fused_expert_weights (List[str])
- Return type:
None
- expert_biases: List[str]
Per-expert bias names relative to expert_prefix (per-expert format).
- expert_prefix(layer_idx, expert_idx)
- Parameters:
layer_idx (int)
expert_idx (int)
- Return type:
str
- expert_prefix_name: str = ''
Expert prefix relative to moe_prefix with {expert_idx} placeholder, e.g. experts.{expert_idx}.
- expert_weights: List[str]
Per-expert weight names relative to expert_prefix (per-expert format).
- fused_expert_weights: List[str]
Fused expert weight names relative to moe_prefix, e.g.
["experts.gate_up_proj", "experts.down_proj"].
- is_fused_experts: bool = False
If True, experts are stored as single fused tensors (shape [num_experts, ...]).
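To illustrate what the fused layout implies for pruning (a hypothetical sketch, not the library's implementation — the real code operates on torch tensors, but the indexing logic is the same): removing experts from a fused weight of shape [num_experts, ...] is an index selection along the first axis.

```python
import numpy as np

# Hypothetical fused expert weight: 8 experts, each a 4x16 projection.
num_experts = 8
fused = np.arange(num_experts * 4 * 16, dtype=np.float32).reshape(num_experts, 4, 16)

# Experts selected for removal (e.g. by some importance criterion).
experts_to_remove = {2, 5}
keep = [e for e in range(num_experts) if e not in experts_to_remove]

# Expert removal on a fused tensor is a gather along axis 0.
pruned = fused[keep]
assert pruned.shape == (6, 4, 16)
```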
- module_name_regex()
- Return type:
str
- moe_prefix(layer_idx)
- Parameters:
layer_idx (int)
- Return type:
str
- moe_prefix_name: str
MoE prefix layer name with {layer_idx} placeholder, e.g. model.layers.{layer_idx}.moe.
- router_biases: List[str]
Router bias names relative to moe_prefix.
- router_weights: List[str]
Router weight names relative to moe_prefix.
- target_name: str
Module name for hook registration; supports a regex: prefix.
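To show how the fields fit together, here is a minimal self-contained stand-in for the documented dataclass (the real class lives in the library and inherits from LayerDescriptor; the prefix-composition behavior of moe_prefix and expert_prefix is an assumption based on the documented placeholders, and the Mixtral-style key names are illustrative only):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ExpertRemovalLayerDescriptor:  # minimal stand-in for the documented class
    target_name: str
    moe_prefix_name: str
    expert_prefix_name: str = ""
    router_weights: List[str] = field(default_factory=list)
    router_biases: List[str] = field(default_factory=list)
    expert_weights: List[str] = field(default_factory=list)
    expert_biases: List[str] = field(default_factory=list)
    is_fused_experts: bool = False
    fused_expert_weights: List[str] = field(default_factory=list)

    def moe_prefix(self, layer_idx: int) -> str:
        # Assumed behavior: fill in the {layer_idx} placeholder.
        return self.moe_prefix_name.format(layer_idx=layer_idx)

    def expert_prefix(self, layer_idx: int, expert_idx: int) -> str:
        # Assumed behavior: join the MoE prefix with the per-expert prefix.
        expert_part = self.expert_prefix_name.format(expert_idx=expert_idx)
        return f"{self.moe_prefix(layer_idx)}.{expert_part}"

# Per-expert (non-fused) layout, as in a typical Mixtral-style MoE block.
desc = ExpertRemovalLayerDescriptor(
    target_name="regex:model.layers.\\d+.moe",
    moe_prefix_name="model.layers.{layer_idx}.moe",
    expert_prefix_name="experts.{expert_idx}",
    router_weights=["gate.weight"],
    expert_weights=["w1.weight", "w2.weight", "w3.weight"],
)
print(desc.expert_prefix(3, 7))  # model.layers.3.moe.experts.7
```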
- class ExpertRemovalPruningMixIn
Bases: PruningMixIn
- __init__(layer_descriptor)
- Parameters:
layer_descriptor (ExpertRemovalLayerDescriptor)
- prune_single_layer(layer_idx, parent_state_dict, new_state_dict, original_config, new_config, mlp_init_mode, mlp_init_config, keys, **kwargs)
- Parameters:
layer_idx (int)
parent_state_dict (dict)
new_state_dict (dict)
original_config (PreTrainedConfig)
new_config (PreTrainedConfig)
mlp_init_mode (MlpInitMode)
mlp_init_config (dict[str, Any] | None)
keys (dict)
- Return type:
Dict[str, Tensor]
- supported_hooks()
- Return type:
List[Type[ForwardHook]]
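To give a feel for what prune_single_layer does with the per-expert key layout (a hypothetical sketch, not the library's implementation — the key pattern and the renumber-survivors policy are assumptions): the layer's expert keys are matched against the descriptor's prefixes, removed experts are dropped, and the survivors are renumbered contiguously.

```python
import re

def remove_experts_from_layer(state_dict, layer_idx, experts_to_remove):
    """Drop the given experts from one MoE layer and renumber the survivors.

    Sketch only: assumes keys like 'model.layers.{i}.moe.experts.{e}.w1.weight'.
    """
    prefix = f"model.layers.{layer_idx}.moe.experts."
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.(.+)")
    # Map surviving old expert indices to new, contiguous ones.
    old_ids = sorted({int(m.group(1)) for k in state_dict if (m := pattern.match(k))})
    survivors = [e for e in old_ids if e not in experts_to_remove]
    remap = {old: new for new, old in enumerate(survivors)}

    new_state = {}
    for key, value in state_dict.items():
        m = pattern.match(key)
        if m is None:
            new_state[key] = value          # non-expert tensor: keep as-is
            continue
        old = int(m.group(1))
        if old in remap:                    # surviving expert: renumber
            new_state[f"{prefix}{remap[old]}.{m.group(2)}"] = value
    return new_state

sd = {f"model.layers.0.moe.experts.{e}.w1.weight": e for e in range(4)}
sd["model.layers.0.moe.gate.weight"] = "router"
pruned = remove_experts_from_layer(sd, 0, {1})
# Expert 2 becomes expert 1, expert 3 becomes expert 2; router key unchanged.
```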