checkpoint_utils
Utilities for loading and initializing PyTorch model checkpoints (AnyModel / HF layouts).
Functions
| Function | Summary |
|---|---|
| `copy_tokenizer` | Prefer loading the tokenizer from the huggingface hub (when a tokenizer_name.txt file is available) to avoid collisions between transformers versions. |
| `is_valid_decilm_checkpoint` | True if the checkpoint config loads and defines block_configs (AnyModel / puzzletron layout). |
| `skip_init` | Heavily inspired by torch.nn.utils.skip_init but does not require the module to accept a "device" kwarg. |
- copy_tokenizer(source_dir_or_tokenizer_name, target_dir, on_failure='raise')
Prefer loading the tokenizer from the huggingface hub (when a tokenizer_name.txt file is available) to avoid collisions between transformers versions.
- Parameters:
source_dir_or_tokenizer_name (Path | str)
target_dir (Path | str)
on_failure (Literal['raise', 'warn'])
- Return type:
None
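The real helper's internals are not shown here, but its signature and summary suggest the following shape: record the hub tokenizer name when `tokenizer_name.txt` exists, otherwise fall back to copying serialized tokenizer files. A minimal sketch under those assumptions (the fallback file patterns and the helper name are illustrative, not the actual implementation):

```python
import shutil
import warnings
from pathlib import Path


def copy_tokenizer_sketch(source_dir, target_dir, on_failure="raise"):
    """Illustrative sketch: prefer propagating tokenizer_name.txt (so the
    tokenizer is re-loaded from the hub by name, dodging version skew
    between transformers serialization formats) over copying files."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    name_file = source_dir / "tokenizer_name.txt"
    try:
        if name_file.is_file():
            shutil.copy2(name_file, target_dir / "tokenizer_name.txt")
        else:
            # Fall back to copying serialized tokenizer files (assumed patterns).
            copied = False
            for pattern in ("tokenizer*.json", "*.model", "special_tokens_map.json"):
                for f in source_dir.glob(pattern):
                    shutil.copy2(f, target_dir / f.name)
                    copied = True
            if not copied:
                raise FileNotFoundError(f"no tokenizer files in {source_dir}")
    except Exception as err:
        if on_failure == "raise":
            raise
        warnings.warn(f"copy_tokenizer failed: {err}")
```

The `on_failure='warn'` mode mirrors the documented parameter: failures are reported but do not abort, which is useful when the tokenizer is optional for the target checkpoint.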
- init_empty_module(module_cls, dtype, *init_args, **init_kwargs)
- Parameters:
module_cls (type[NNModule])
dtype (dtype)
- Return type:
NNModule
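The signature implies the module is constructed without allocating or initializing real weights, then cast to the requested `dtype`. One way to get that effect with stock PyTorch, shown as an assumption about the implementation rather than the actual code:

```python
import torch
import torch.nn as nn


def init_empty_module_sketch(module_cls, dtype, *init_args, **init_kwargs):
    # Building under the meta device allocates no storage and skips the
    # (wasted) random-init kernels in module_cls.__init__.
    with torch.device("meta"):
        module = module_cls(*init_args, **init_kwargs)
    # Materialize uninitialized storage on CPU, then cast to the target dtype.
    return module.to_empty(device="cpu").to(dtype)


m = init_empty_module_sketch(nn.Linear, torch.float16, 4, 8)
```

The resulting parameters hold garbage values by design; the caller is expected to fill them, e.g. from a checkpoint.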
- init_module_with_state_dict(state_dict, module_cls, *init_args, **init_kwargs)
- Parameters:
state_dict (dict[str, Tensor])
module_cls (type[NNModule])
- Return type:
NNModule
- is_valid_decilm_checkpoint(checkpoint_dir, trust_remote_code=False)
True if the checkpoint config loads and defines block_configs (AnyModel / puzzletron layout).
- Parameters:
checkpoint_dir (Path | str) – Path to checkpoint directory
trust_remote_code (bool) – If True, allows execution of custom code from the model repository. This is a security risk if the model source is untrusted. Only set to True if you trust the source of the model. Defaults to False for security.
- Returns:
True if the config has block_configs, False otherwise
- Return type:
bool
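A rough stand-in for this check using only the raw `config.json`; the real helper presumably loads the config through transformers (which is where `trust_remote_code` matters), so treat this as illustrative:

```python
import json
from pathlib import Path


def has_block_configs(checkpoint_dir):
    """Approximation of the validity check: does config.json parse and
    define block_configs? (The actual helper presumably goes through
    transformers config loading, honoring trust_remote_code.)"""
    config_path = Path(checkpoint_dir) / "config.json"
    if not config_path.is_file():
        return False
    try:
        config = json.loads(config_path.read_text())
    except json.JSONDecodeError:
        return False
    return isinstance(config, dict) and "block_configs" in config
```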
- load_state_dict(checkpoint_dir)
- Parameters:
checkpoint_dir (Path | str)
- Return type:
dict[str, Tensor]
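A simplified sketch of what a directory-level state-dict loader can look like, merging sharded `*.bin` files; the real helper may also handle safetensors shards and index files, so the glob pattern here is an assumption:

```python
import torch
from pathlib import Path


def load_state_dict_sketch(checkpoint_dir):
    """Illustrative only: merge every *.bin shard in the directory into
    a single flat state dict keyed by parameter name."""
    state_dict = {}
    for shard in sorted(Path(checkpoint_dir).glob("*.bin")):
        # weights_only=True refuses to unpickle arbitrary Python objects.
        state_dict.update(torch.load(shard, map_location="cpu", weights_only=True))
    return state_dict
```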
- skip_init(module_cls, *args, **kwargs)
Heavily inspired by torch.nn.utils.skip_init but does not require the module to accept a "device" kwarg.
- Return type:
Module
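The "no device kwarg needed" variant described above can be approximated with a meta-device context: the default device applies during construction, so the module's `__init__` never sees a `device` argument (which is the limitation of `torch.nn.utils.skip_init`). A sketch of the technique, not the actual implementation:

```python
import torch
import torch.nn as nn


def skip_init_sketch(module_cls, *args, **kwargs):
    # Under the meta device, parameter construction allocates no storage
    # and the random-init kernels are effectively free, so __init__ needs
    # no cooperation (no "device" kwarg) from module_cls.
    with torch.device("meta"):
        module = module_cls(*args, **kwargs)
    # to_empty() produces real but uninitialized CPU tensors; callers are
    # expected to overwrite them, e.g. via load_state_dict.
    return module.to_empty(device="cpu")


layer = skip_init_sketch(nn.Linear, 16, 32)
```

By contrast, `torch.nn.utils.skip_init` constructs the module on the meta device by passing `device="meta"` into `__init__`, which only works for modules that forward that kwarg to their parameter constructors.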