calc_subblock_stats
Computes memory and runtime statistics for subblocks.
Functions
launch_calc_subblock_stats(cfg): Launch the calc subblock stats function with Hydra configuration.
- add_int8_runtime_estimates(subblock_stats)
- Parameters:
subblock_stats (list[dict])
- Return type:
None
- calculate_subblock_stats(calc_subblock_stats_config, teacher_dir, model_config, descriptor, master_puzzle_dir, subblock_configs, batch_size, prefill_seq_len, generation_seq_len, prefill_queue_size, n_embd, n_head, vocab_size, benchmark_iterations, use_cuda_graph, weights_dtype, activations_dtype, kv_cache_dtype, allocate_prefill_query, moe_stats_file=None)
- Parameters:
calc_subblock_stats_config (DictConfig)
teacher_dir (Path)
model_config (PreTrainedConfig)
descriptor (Type[ModelDescriptor])
master_puzzle_dir (Path)
subblock_configs (list[immutabledict[str, AttentionConfig | FFNConfig]])
batch_size (int)
prefill_seq_len (int)
generation_seq_len (int)
prefill_queue_size (int)
n_embd (int)
n_head (int)
vocab_size (int)
benchmark_iterations (int | None)
use_cuda_graph (bool)
weights_dtype (dtype)
activations_dtype (dtype)
kv_cache_dtype (dtype)
allocate_prefill_query (bool)
moe_stats_file (str | Path | None)
- Return type:
dict
- launch_calc_subblock_stats(cfg)
Launch the calc subblock stats function with Hydra configuration.
- Parameters:
cfg (DictConfig)
- Return type:
None
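The shape parameters accepted by calculate_subblock_stats (batch_size, prefill_seq_len, generation_seq_len, n_embd, n_head, kv_cache_dtype) are the kind of inputs a KV-cache memory estimate for an attention subblock depends on. The sketch below is a back-of-the-envelope illustration of that arithmetic only. It is not this module's implementation. The helper name estimate_kv_cache_bytes, the n_kv_heads argument, and the bytes_per_element argument (standing in for the size of kv_cache_dtype) are assumptions introduced here for illustration.

```python
def estimate_kv_cache_bytes(
    batch_size: int,
    prefill_seq_len: int,
    generation_seq_len: int,
    n_embd: int,
    n_head: int,
    n_kv_heads: int,
    bytes_per_element: int,
) -> int:
    """Rough KV-cache size, in bytes, for ONE attention subblock.

    Hypothetical helper: assumes a standard multi-head / grouped-query
    attention layout where each cached token stores one key and one value
    vector per KV head. Not taken from the calc_subblock_stats module.
    """
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")

    head_dim = n_embd // n_head                       # width of a single head
    total_tokens = prefill_seq_len + generation_seq_len

    # Factor of 2 accounts for storing both keys and values.
    return 2 * batch_size * total_tokens * n_kv_heads * head_dim * bytes_per_element


if __name__ == "__main__":
    size = estimate_kv_cache_bytes(
        batch_size=1,
        prefill_seq_len=1024,
        generation_seq_len=1024,
        n_embd=4096,
        n_head=32,
        n_kv_heads=8,          # assumed grouped-query setting
        bytes_per_element=2,   # e.g. a 16-bit KV-cache dtype
    )
    print(f"{size / 2**20:.1f} MiB per attention subblock")
```

With these example values, head_dim is 128 and the estimate comes to 8 MiB per attention subblock. Actual measured numbers can differ, since real allocations may also include prefill queues, padding, and allocator overhead.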