convert_hf_config
Convert a modelopt quantization export config to align with the llm-compressor config format.
Functions
- convert_hf_quant_config_format(input_config) – Converts a modelopt quantization config dictionary to align with the llm-compressor config format.
- Parameters:
input_config (dict[str, Any]) – The original quantization config dictionary.
- Return type:
dict[str, Any]
Note
The “targets” field specifies which PyTorch module types to quantize. Compressed-tensors works with any PyTorch module type and uses dynamic matching against module.__class__.__name__. Typically this includes “Linear” modules, but can also include “Embedding” and other types.
See: https://github.com/neuralmagic/compressed-tensors/blob/fa6a48f1da6b47106912bcd25eba7171ba7cfec7/src/sparsetensors/quantization/quant_scheme.py#L29
Example usage: https://github.com/neuralmagic/compressed-tensors/blob/9938a6ec6e10498d39a3071dfd1c40e3939ee80b/tests/test_quantization/lifecycle/test_apply.py#L118
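To make the dynamic matching concrete, here is a minimal sketch using only standard PyTorch; the helper find_target_modules is illustrative and is not part of compressed-tensors or modelopt:

import torch.nn as nn

def find_target_modules(model: nn.Module, targets: list[str]) -> dict[str, nn.Module]:
    # Collect submodules whose class name appears in `targets`,
    # mirroring the module.__class__.__name__ matching described above.
    return {
        name: module
        for name, module in model.named_modules()
        if module.__class__.__name__ in targets
    }

model = nn.Sequential(nn.Embedding(100, 16), nn.Linear(16, 4))
matched = find_target_modules(model, targets=["Linear", "Embedding"])
print(sorted(matched))  # ['0', '1'] -- both submodules match by class name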
Example (FP8 input config):
{
    "producer": {"name": "modelopt", "version": "0.19.0"},
    "quantization": {
        "quant_algo": "FP8",
        "kv_cache_quant_algo": "FP8",
        "exclude_modules": ["lm_head"],
    },
}
- Returns:
A new dictionary in the target format.
Example (for FP8 input):
{ "config_groups": { "group_0": { "input_activations": {"dynamic": False, "num_bits": 8, "type": "float"}, "weights": {"dynamic": False, "num_bits": 8, "type": "float"}, } }, "ignore": ["lm_head"], "quant_algo": "FP8", "kv_cache_scheme": "FP8", "producer": {"name": "modelopt", "version": "0.29.0"}, }