sparsification

High-level API to automatically sparsify your model with various algorithms.

Functions

sparsify

Sparsify a given model and search for the optimal sparsified weights.

export

Export a sparse dynamic model to a regular model.

export(model)

Export a sparse dynamic model to a regular model.

This should be done after the model is fine-tuned and the weights are fixed.

Warning

After the call to export(), the sparsity mask will no longer be enforced, so any subsequent weight update would destroy the sparsity pattern. If you want to continue training, call export() only after training is finished.

Parameters:

model (Module) – The sparse model to be exported.

Return type:

Module
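
Conceptually, export applies the sparsity mask to the weights one final time and then discards it, leaving ordinary dense tensors whose pruned entries are plain zeros. The following is a minimal pure-Python sketch of that idea; SparseLayer and export_layer are illustrative names, not the library's API:

```python
class SparseLayer:
    """Toy layer holding raw weights plus a binary sparsity mask."""

    def __init__(self, weights, mask):
        self.weights = weights  # raw (dense) weights
        self.mask = mask        # 1 keeps a weight, 0 prunes it

    def effective_weights(self):
        # While the mask is active, pruned weights are zeroed on the fly.
        return [w * m for w, m in zip(self.weights, self.mask)]


def export_layer(layer):
    # Bake the mask into the weights and drop it. After this point nothing
    # re-applies the mask, so further training would destroy the sparsity.
    layer.weights = layer.effective_weights()
    layer.mask = None
    return layer


layer = SparseLayer(weights=[0.9, -0.1, 0.4, 0.05], mask=[1, 0, 1, 0])
exported = export_layer(layer)  # pruned entries are now fixed zeros
```

This mirrors the warning above: once the mask is gone, nothing keeps future weight updates from filling the zeroed positions back in.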

sparsify(model, mode, config=None)

Sparsify a given model and search for the optimal sparsified weights.

Parameters:
  • model (Module) – A standard model that contains standard building blocks to be sparsified in-place.

  • mode (_ModeDescriptor | str | List[_ModeDescriptor | str] | List[Tuple[str, Dict[str, Any]]]) –

    The desired mode(s) (and configurations) for the convert process, given as a string, a _ModeDescriptor, a list of either, or a list of (mode, config) tuples. Modes set up the model for different model-optimization algorithms. The following modes are available:

    • "sparse_magnitude": The model will be sparsified according to the magnitude of weights in each layer. The mode’s config is described in SparseMagnitudeConfig.

    • "sparsegpt": The model will be sparsified and the weights are updated optimally using a Hessian approximation of the loss function (see the SparseGPT paper for details). The mode’s config is described in SparseGPTConfig.

    If a mode is given as a (mode, config) tuple, the second element specifies that mode's configuration. If no configuration is provided, the default configuration is used.

  • config (Dict[str, Any] | None) –

    Additional optional arguments to configure the search. Currently, we support:

    • verbose: Whether to print detailed search stats during search.

    • forward_loop: A Callable that takes a model as input and runs a forward loop on it. It is recommended to choose the data loader used inside the forward loop carefully to reduce the runtime. Cannot be provided at the same time as data_loader and collect_func.

    • data_loader: An iterator yielding batches of data for calibrating the normalization layers in the model or computing gradient scores. It is recommended to use the same data loader as for training but with significantly fewer iterations. Cannot be provided at the same time as forward_loop.

    • collect_func: A Callable that takes a batch of data from the data loader as input and returns the input to model.forward() as described in run_forward_loop. Cannot be provided at the same time as forward_loop.

    Note

    Additional configuration options may be added by individual algorithms. Please refer to the documentation of the individual algorithms for more information.
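
    As an illustration, a config combining the options above might look like the following sketch. The keys match the options listed here; the calibration data and collect_func body are placeholders, not real values:

    ```python
    # Hypothetical calibration data: a few small batches (placeholder values).
    calib_loader = [{"input_ids": [1, 2, 3]}, {"input_ids": [4, 5, 6]}]

    config = {
        "verbose": True,                                   # print search stats
        "data_loader": calib_loader,                       # small calibration set
        "collect_func": lambda batch: batch["input_ids"],  # batch -> model.forward() input
    }
    # Note: forward_loop must NOT be set here, since it is mutually
    # exclusive with data_loader / collect_func.
    ```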

Return type:

Tuple[Module, Dict[str, Any]]

Returns: A sparsified model

Note

The given model is sparsified in-place. The returned model is thus a reference to the same model instance as the input model.
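
To make the "sparse_magnitude" mode concrete, here is a small sketch of the underlying principle: within a layer, keep the largest-magnitude weights and zero out the rest. This illustrates the idea only; it is not the library's implementation, and magnitude_sparsify is a hypothetical helper:

```python
def magnitude_sparsify(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of smallest-magnitude weights."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold = magnitude of the n_prune-th smallest |w|; everything at
    # or below it is pruned, everything above it is kept unchanged.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [w if abs(w) > threshold else 0.0 for w in weights]


print(magnitude_sparsify([0.9, -0.1, 0.4, 0.05], sparsity=0.5))
# [0.9, 0.0, 0.4, 0.0]
```

The actual mode additionally supports structured patterns and per-layer configuration via SparseMagnitudeConfig, as described above.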