checkpoint_utils_hf

Utilities for loading and saving Hugging Face-format checkpoints (AutoConfig + optional block_configs).

Functions

force_cache_dynamic_modules

init_model_from_config

Build a model from a config with meta/uninitialized weights (used, e.g., for subblock parameter counts).

load_model_config

Load model configuration from a checkpoint directory.

save_checkpoint

save_model_config

save_subblocks

force_cache_dynamic_modules(config, checkpoint_dir, trust_remote_code=False)
Parameters:
  • config (PretrainedConfig)

  • checkpoint_dir (Path | str)

  • trust_remote_code (bool)

init_model_from_config(config, *, trust_remote_code=False, **kwargs)

Build a model from a config with meta/uninitialized weights (used, e.g., for subblock parameter counts).

trust_remote_code defaults to False and is used only by AutoModelForCausalLM.from_config. Pass True for configs that rely on custom modeling code shipped with the checkpoint.

Parameters:
  • config (PretrainedConfig)

  • trust_remote_code (bool)

Return type:

PreTrainedModel
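
The idea of building a model on meta weights to count parameters can be sketched as follows. This is a minimal illustration and not the body of init_model_from_config: it assumes PyTorch 2.0 or newer (for the torch.device context manager) and uses a small GPT2Config with made-up sizes so that nothing is downloaded.

```python
import torch
from transformers import AutoModelForCausalLM, GPT2Config

# Tiny, locally constructed config (illustrative values, not from the source).
config = GPT2Config(n_layer=2, n_embd=64, n_head=2, vocab_size=1000, n_positions=128)

# Under the meta device, parameters carry shapes and dtypes but no storage,
# so even very large architectures can be instantiated cheaply.
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

# Parameter counts only need shapes, so they work on meta tensors.
total_params = sum(p.numel() for p in model.parameters())
block_params = sum(p.numel() for p in model.transformer.h[0].parameters())
print(f"total={total_params:,} first_block={block_params:,}")
```

A model built this way cannot run a forward pass until real weights are materialized, which is acceptable when only shapes and counts are needed.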

load_model_config(checkpoint_dir, model_config_overrides=None, ignore_unexpected_config_keys=False, trust_remote_code=False)

Load model configuration from a checkpoint directory.

Parameters:
  • checkpoint_dir (Path | str) – Path to the checkpoint directory (e.g. containing config.json).

  • model_config_overrides (Mapping | None) – Optional mapping of config overrides.

  • ignore_unexpected_config_keys (bool) – If True, ignore unexpected config keys.

  • trust_remote_code (bool) – If True, allow execution of custom code from the model repository. This is a security risk when the source is untrusted, so enable it only for sources you trust. Defaults to False.

Returns:

Loaded model configuration (PretrainedConfig).
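
How overrides and unexpected keys might interact can be sketched with the standard library alone. The helper load_config_dict below is hypothetical and works on a plain dict rather than a PretrainedConfig. It assumes that an "unexpected" key is an override that does not match any field already in config.json, which is an interpretation and not something this page states.

```python
import json
import tempfile
from pathlib import Path


def load_config_dict(checkpoint_dir, overrides=None, ignore_unexpected=False):
    """Hypothetical helper: read config.json and apply known overrides."""
    config = json.loads((Path(checkpoint_dir) / "config.json").read_text())
    unexpected = []
    for key, value in (overrides or {}).items():
        if key in config:
            config[key] = value  # known field: apply the override
        else:
            unexpected.append(key)  # unknown field: collect for reporting
    if unexpected and not ignore_unexpected:
        raise ValueError(f"Unexpected config keys: {unexpected}")
    return config


with tempfile.TemporaryDirectory() as tmp:
    # Write a toy config.json so the example is self-contained.
    (Path(tmp) / "config.json").write_text(
        json.dumps({"hidden_size": 64, "num_hidden_layers": 2})
    )
    cfg = load_config_dict(tmp, {"num_hidden_layers": 4})
    lenient = load_config_dict(tmp, {"not_a_field": 1}, ignore_unexpected=True)

print(cfg)
```

In strict mode (the default in this sketch), passing {"not_a_field": 1} raises ValueError instead of being dropped silently.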

save_checkpoint(model, checkpoint_dir, descriptor)
Parameters:
  • model (PreTrainedModel)

  • checkpoint_dir (Path | str)

  • descriptor (ModelDescriptor)

Return type:

None

save_model_config(model_config, checkpoint_dir)
Parameters:
  • model_config (PretrainedConfig)

  • checkpoint_dir (Path | str)

Return type:

None

save_subblocks(state_dict, checkpoint_dir, weight_map=None, multi_threaded=True, max_workers=None)
Parameters:
  • state_dict (dict[str, Tensor])

  • checkpoint_dir (Path | str)

  • weight_map (dict[str, str] | None)

  • multi_threaded (bool)

  • max_workers (int | None)

Return type:

None
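
The parameters of save_subblocks suggest a pattern of grouping state-dict entries by target file and writing the groups in parallel. The sketch below shows that pattern only. It assumes weight_map maps each parameter name to a file name, as in a Hugging Face sharded-checkpoint index. The helpers group_by_file, save_groups, and save_json are hypothetical, and plain lists written as JSON stand in for tensors and the real on-disk format, which this page does not describe.

```python
import json
import tempfile
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def group_by_file(state_dict, weight_map=None, default_file="model.json"):
    """Group entries by target file; unmapped names go to default_file."""
    groups = defaultdict(dict)
    for name, value in state_dict.items():
        groups[(weight_map or {}).get(name, default_file)][name] = value
    return dict(groups)


def save_groups(groups, checkpoint_dir, save_fn, multi_threaded=True, max_workers=None):
    """Write each group with save_fn, optionally using a thread pool."""
    out_dir = Path(checkpoint_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    jobs = [(out_dir / file_name, shard) for file_name, shard in groups.items()]
    if multi_threaded:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # list() waits for all jobs and re-raises any worker exception.
            list(pool.map(lambda job: save_fn(*job), jobs))
    else:
        for job in jobs:
            save_fn(*job)


def save_json(path, shard):
    """Stand-in writer; a real checkpoint would use a tensor format."""
    path.write_text(json.dumps(shard))


state = {"layers.0.w": [1.0, 2.0], "layers.0.b": [0.5], "layers.1.w": [3.0]}
wmap = {
    "layers.0.w": "block_0.json",
    "layers.0.b": "block_0.json",
    "layers.1.w": "block_1.json",
}
with tempfile.TemporaryDirectory() as tmp:
    save_groups(group_by_file(state, wmap), tmp, save_json, max_workers=2)
    written = sorted(p.name for p in Path(tmp).iterdir())
    block_0 = json.loads((Path(tmp) / "block_0.json").read_text())

print(written)
```

Threads suit this kind of workload because file writes are I/O-bound; setting multi_threaded to False in the sketch gives a sequential path that is easier to debug.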