no_op

No-op modules for replacing layers during pruning.

Classes

MatchingZeros

Module that returns zeros matching the input shape.

Same

Module that returns the input unchanged.

Functions

return_tuple_of_size

Create a wrapper class that returns a tuple of the given size.

class MatchingZeros

Bases: Module

Module that returns zeros matching the input shape.

Used to replace MLP or attention layers with no-ops. Returns zeros because the hidden_states are added to the residuals, so a no-op implementation should leave the residual unchanged.

forward(hidden_states, *args, **kwargs)
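A minimal sketch of how such a module could be implemented (assuming PyTorch; the actual implementation may differ):

```python
import torch
from torch import nn

class MatchingZeros(nn.Module):
    """Return zeros shaped like the input.

    The replaced layer's output is added to the residual stream, so
    returning zeros leaves the residual unchanged.
    """

    def forward(self, hidden_states, *args, **kwargs):
        # zeros_like matches shape, dtype, and device of the input.
        return torch.zeros_like(hidden_states)
```

For example, `decoder_layer.mlp = MatchingZeros()` effectively prunes the MLP while keeping the residual connection intact.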
class Same

Bases: Module

Module that returns the input unchanged.

Used to replace normalization layers with identity operations.

forward(hidden_states, *args, **kwargs)
property weight

Supports NemotronH with scoring_activations, where the lm_head dtype is queried via self.lm_head.weight.dtype.
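A minimal sketch of such an identity module, assuming PyTorch; the weight property shown here is a hypothetical stand-in for whatever the real implementation exposes:

```python
import torch
from torch import nn

class Same(nn.Module):
    """Return the input unchanged (identity replacement for norm layers)."""

    def forward(self, hidden_states, *args, **kwargs):
        return hidden_states

    @property
    def weight(self):
        # Hypothetical: some callers (e.g. NemotronH) read
        # `module.weight.dtype`; expose an empty tensor so that
        # attribute access does not fail.
        return torch.empty(0)
```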

return_tuple_of_size(cls, size)

Create a wrapper class that returns a tuple of the given size.

Useful for replacing modules that return multiple outputs (e.g., attention layers that return (hidden_states, attn_weights)).

Parameters:
  • cls (type[Module]) – The base module class to wrap.

  • size (int) – The size of the tuple to return.

Returns:

A new class that wraps the base class and returns a tuple of the given size.

Return type:

type[Module]

Example

>>> decoder_layer.self_attn = return_tuple_of_size(MatchingZeros, size=2)()
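The wrapper above could be sketched as follows; this is a plausible implementation padding the extra tuple slots with None, and the actual class may behave differently:

```python
def return_tuple_of_size(cls, size):
    """Wrap `cls` so its forward returns (output, None, ..., None) of length `size`."""

    class TupleWrapper(cls):
        def forward(self, *args, **kwargs):
            out = super().forward(*args, **kwargs)
            # Pad with None so callers that unpack extra outputs
            # (e.g. attention weights) still work.
            return (out,) + (None,) * (size - 1)

    return TupleWrapper
```

With `size=2`, the wrapped MatchingZeros mimics an attention layer returning (hidden_states, attn_weights): the first element is zeros matching the input, the second is None.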