anymodel

Modules

modelopt.torch.puzzletron.anymodel.converter

Converters for transforming HuggingFace models to AnyModel format.

modelopt.torch.puzzletron.anymodel.model_descriptor

Model descriptors for defining model-specific properties and layer naming conventions.

modelopt.torch.puzzletron.anymodel.models

modelopt.torch.puzzletron.anymodel.puzzformer

Utilities for patching and transforming HuggingFace models to work with AnyModel.

AnyModel: Architecture-agnostic model compression for HuggingFace models.

This module provides a declarative approach to model compression that works with any HuggingFace model without requiring custom modeling code. Instead of duplicating HuggingFace modeling classes, AnyModel uses ModelDescriptors that define:

  1. Which decoder layer class(es) to patch for heterogeneous configs

  2. How to map BlockConfig to layer-specific overrides

  3. Weight name patterns for subblock checkpointing
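The three responsibilities above can be sketched as a small declarative object. This is a minimal illustration only: the names (`ModelDescriptor`, `BlockConfig`, the field names) are assumptions for the sketch, and the actual API in modelopt.torch.puzzletron.anymodel.model_descriptor may differ.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class BlockConfig:
    """Per-layer override (hypothetical fields), e.g. a pruned FFN size."""
    ffn_intermediate_size: Optional[int] = None
    attention_disabled: bool = False

@dataclass
class ModelDescriptor:
    """Declares what AnyModel needs to know about one architecture."""
    # 1. Which decoder layer class to patch for heterogeneous configs
    decoder_layer_cls_name: str
    # 3. Weight name patterns used for subblock checkpointing
    weight_name_patterns: list = field(default_factory=list)

    def layer_overrides(self, block: BlockConfig) -> dict:
        """2. Map a BlockConfig to layer-specific config overrides."""
        overrides = {}
        if block.ffn_intermediate_size is not None:
            overrides["intermediate_size"] = block.ffn_intermediate_size
        if block.attention_disabled:
            overrides["num_attention_heads"] = 0
        return overrides

# Illustrative descriptor for a Llama-style model
llama = ModelDescriptor(
    decoder_layer_cls_name="LlamaDecoderLayer",
    weight_name_patterns=[
        "model.layers.{i}.self_attn.*",
        "model.layers.{i}.mlp.*",
    ],
)
print(llama.layer_overrides(BlockConfig(ffn_intermediate_size=2048)))
```

Because the descriptor is plain data plus one mapping function, adding a new architecture means writing a new descriptor rather than duplicating modeling code.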

Example usage:
>>> from modelopt.torch.puzzletron.anymodel import convert_model
>>> convert_model(
...     input_dir="path/to/hf_checkpoint",
...     output_dir="path/to/anymodel_checkpoint",
...     converter="llama",
... )

Supported models:
  • llama: Llama 2, Llama 3, Llama 3.1, Llama 3.2

  • (more to come: qwen2, mistral_small, etc.)