model_config_utils
Common utils for the ModelConfig.
Functions
| Function | Description |
| --- | --- |
| merge_fc1_gate | Merges the fc1 and gate fields in model_config into a single LinearConfig. |
| merge_qkv | Merges the qkv fields in model_config from QKVConfig to a single LinearConfig. |
| model_config_from_dict | Load a dict to a ModelConfig instance. |
| model_config_to_dict | Converts the instance to a python dict. |
| naive_quantization | Generates a constant scaling factor (1) with target quantization. |
| pack_linear_weights | Packs the quantized linear weights in the model_config to the quantized format. |
| pad_weights | Returns the weights padded to be divisible by tp_size. |
| restore_model_config | Recursively restores the model_config from json and loads np.ndarray or torch.Tensor weights from weights. |
| split_config_and_weights | Utility function to split the weights (or any torch.Tensor) in the nested config into the weights dict. |
- merge_fc1_gate(model_config)
Merges the fc1 and gate fields in model_config into a single LinearConfig.
- merge_qkv(model_config)
Merges the qkv fields in model_config from QKVConfig to a single LinearConfig.
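A minimal sketch of fusing projections before export, assuming `model_config` is an already-populated ModelConfig whose layers still carry separate q/k/v and fc1/gate projections, and that both helpers mutate the config in place; the `modelopt.torch.export` import path is also an assumption.

```python
from modelopt.torch.export.model_config_utils import merge_fc1_gate, merge_qkv

merge_qkv(model_config)       # fuse per-layer q/k/v projections into one LinearConfig
merge_fc1_gate(model_config)  # fuse the MLP fc1 and gate projections the same way
```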
- model_config_from_dict(d)
Load a dict to a ModelConfig instance.
- Parameters:
d (dict) –
- Return type:
ModelConfig
- model_config_to_dict(model_config)
Converts the instance to a python dict.
- Parameters:
model_config (ModelConfig) –
- Return type:
dict
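A minimal sketch of the dict round trip, assuming `model_config` is an existing ModelConfig instance and the same import path as above.

```python
from modelopt.torch.export.model_config_utils import (
    model_config_from_dict,
    model_config_to_dict,
)

config_dict = model_config_to_dict(model_config)  # ModelConfig -> nested python dict
restored = model_config_from_dict(config_dict)    # nested dict -> ModelConfig
```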
- naive_quantization(config)
Generates a constant scaling factor (1) with target quantization.
This is for debugging and performance measurement only.
- Parameters:
config (ModelConfig) –
- pack_linear_weights(model_config)
Packs the quantized linear weights in the model_config to the quantized format.
- Parameters:
model_config (ModelConfig) –
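A minimal debug-path sketch combining the two calls above; both are assumed to mutate `model_config` in place, and the import path is an assumption.

```python
from modelopt.torch.export.model_config_utils import (
    naive_quantization,
    pack_linear_weights,
)

naive_quantization(model_config)   # constant (1) scaling factors; debugging / perf runs only
pack_linear_weights(model_config)  # pack the quantized linear weights into the quantized format
```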
- pad_weights(weights, tp_size)
Returns the weights padded to be divisible by tp_size.
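A hypothetical usage sketch; only the call signature comes from this page, while the padded dimension, the example shapes, and the exact padding behavior are assumptions.

```python
import torch

from modelopt.torch.export.model_config_utils import pad_weights

embedding = torch.randn(32001, 4096)        # e.g. a vocab size that tp_size does not divide
padded = pad_weights(embedding, tp_size=8)  # padded so the weight splits evenly across ranks
```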
- restore_model_config(model_config, weights)
Recursively restores the model_config from json and loads np.ndarray or torch.Tensor weights from weights.
- Parameters:
weights (Dict[str, ndarray | Tensor]) –
- split_config_and_weights(config, weights, prefix='transformer', layer_config_dict={})
Utility function to split the weights (or any torch.Tensor) in the nested config into the weights dict.
A weight id starting with transformer or lm_head is also generated to link the original key to the weights dict. The weights in the weights dict are contiguous.
layer_config_dict: A dictionary containing layerwise quantization format information and awq_block_size information when relevant. It is used to export quantization.json for the auto_quant checkpoint.
- Parameters:
weights (Dict[str, tensor]) –
prefix (str) –
layer_config_dict (dict) –
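A minimal sketch of the split/restore round trip built from the two functions above. It assumes `model_config` is a populated ModelConfig, that restore_model_config fills the tensors back in place, and the same import path as earlier; none of these are documented guarantees.

```python
from modelopt.torch.export.model_config_utils import (
    model_config_to_dict,
    restore_model_config,
    split_config_and_weights,
)

config_dict = model_config_to_dict(model_config)

# Move every tensor out of the nested config into a flat, contiguous weights dict,
# keyed by weight ids such as "transformer.<...>" or "lm_head.<...>".
weights = {}
split_config_and_weights(config_dict, weights, prefix="transformer")

# Later (e.g. after serializing config_dict to JSON and the weights separately),
# put the tensors back into the config tree by weight id.
restore_model_config(config_dict, weights)
```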