misc

Classes

EmptyInitOnDevice

Functions

calculate_kv_dim

Calculate the key-value dimension for grouped-query attention.

raise_unknown_subblock_config_error

Raise an error for invalid subblock configuration types.

sizeof_dtype

Return the size in bytes of the given data type.

load_json

Load and parse a JSON file.

solution_to_str

Convert a list of block configurations to a human-readable string representation.

block_config_to_str

Convert a BlockConfig to a human-readable string representation.

subblock_config_to_str

Convert a subblock config (FFN, Attention, Mamba, or MoE) to string.

class EmptyInitOnDevice

Bases: TorchFunctionMode

__init__(device=None, dtype=None)

Create tensors with given device and dtype using uninitialized memory.

Parameters:
  • device – torch.device to work with.

  • dtype – torch.dtype to work with.

Example:

with EmptyInitOnDevice("cuda", dtype=torch.bfloat16):
    model = LLaMA(model_config)
model.load_state_dict(torch.load("llama-lit/7B/lit-llama.pth"))

block_config_to_str(block_config)

Convert a BlockConfig to a human-readable string representation.

TODO: Consider a better place for this function.

Parameters:

block_config (BlockConfig | dict[str, Any] | None) – BlockConfig dataclass or dict containing attention and ffn configs.

Returns:

Formatted string with attention and FFN information, or None if input is None.

Return type:

str | None

calculate_kv_dim(num_key_value_heads, n_head, n_embd)

Calculate the key-value dimension for grouped-query attention.

Parameters:
  • num_key_value_heads (int) – Number of key-value heads.

  • n_head (int) – Total number of attention heads.

  • n_embd (int) – Embedding dimension.

Returns:

Combined dimension for key and value tensors (2 * num_key_value_heads * head_size).

Return type:

int
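Given the documented return value, a minimal sketch of the computation (assuming the per-head size is derived as n_embd // n_head):

```python
def calculate_kv_dim(num_key_value_heads: int, n_head: int, n_embd: int) -> int:
    # Per-head dimension; assumes n_embd divides evenly by n_head.
    head_size = n_embd // n_head
    # Key and value tensors each contribute num_key_value_heads * head_size.
    return 2 * num_key_value_heads * head_size
```

For example, with n_embd=4096, n_head=32, and num_key_value_heads=8, head_size is 128 and the combined K+V dimension is 2 * 8 * 128 = 2048.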

load_json(file_path)

Load and parse a JSON file.

TODO: Consider a better place for this function.

Parameters:

file_path (str) – Path to the JSON file to load.

Returns:

Parsed JSON data as a Python object, or None if the file doesn’t exist.
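A minimal sketch consistent with the described behavior (parse the file if it exists, otherwise return None):

```python
import json
import os


def load_json(file_path: str):
    """Load a JSON file, returning None if the file does not exist."""
    if not os.path.exists(file_path):
        return None
    with open(file_path) as f:
        return json.load(f)
```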

raise_unknown_subblock_config_error(subblock_config)

Raise an error for invalid subblock configuration types.

TODO: Consider a better place for this function.

Parameters:

subblock_config (Any) – The invalid subblock configuration object.

Raises:

ValueError – Always raised with a message indicating the expected types.

Return type:

None
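A sketch of the described behavior; the exact message wording and the list of expected type names are assumptions, not the module's actual text:

```python
from typing import Any


def raise_unknown_subblock_config_error(subblock_config: Any) -> None:
    # Always raises: this helper is the fall-through branch of config dispatch.
    raise ValueError(
        f"Unknown subblock config type {type(subblock_config).__name__!r}; "
        "expected FFNConfig, AttentionConfig, or dict."
    )
```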

sizeof_dtype(dtype)

Return the size in bytes of the given data type.

TODO: Consider a better place for this function.

Parameters:

dtype (dtype) – PyTorch data type or custom type string (e.g., 'nvfp4').

Returns:

Size in bytes of the data type. Special case: 'nvfp4' returns ~0.588 bytes.
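A sketch under two assumptions: the dtype argument exposes an itemsize attribute (true for torch.dtype in recent PyTorch), and the 'nvfp4' figure is the ~0.588 bytes documented above:

```python
def sizeof_dtype(dtype) -> float:
    # Special case: nvfp4 packs 4-bit values plus scaling metadata,
    # averaging roughly 0.588 bytes per element (figure taken from the docs above).
    if dtype == "nvfp4":
        return 0.588
    # Regular dtypes (e.g. torch.float32) report their size via itemsize.
    return dtype.itemsize
```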

solution_to_str(block_configs)

Convert a list of block configurations to a human-readable string representation.

TODO: Consider a better place for this function. A better home for this and the subsequent related functions would be a __repr__ method on BlockConfig, so that printing or calling str(block_config) automatically produces this formatted string.

Parameters:

block_configs (list[dict[str, Any] | BlockConfig]) – List of BlockConfig dataclasses or dicts containing layer configurations.

Returns:

Multi-line string with each block’s configuration on a separate line.

Return type:

str

subblock_config_to_str(subblock_config, subblock_name=None)

Convert a subblock config (FFN, Attention, Mamba, or MoE) to string.

Parameters:
  • subblock_config (FFNConfig | AttentionConfig | dict[str, Any] | None) – FFNConfig, AttentionConfig dataclass or dict.

  • subblock_name (None | str) – Name of subblock (‘ffn’, ‘attention’, ‘mamba’, ‘moe’). Auto-detected if subblock_config is a dataclass.

Returns:

Formatted string showing subblock type and key parameters (e.g., intermediate_size, num_key_value_heads), or None if input is None.

Return type:

str | None