unified_export_hf

Code that exports quantized Hugging Face models for deployment.

Functions

export_hf_checkpoint – Export quantized HuggingFace model checkpoint (transformers or diffusers).

export_hf_checkpoint(model, dtype=None, export_dir='/tmp', save_modelopt_state=False, components=None, extra_state_dict=None, **kwargs)

Export quantized HuggingFace model checkpoint (transformers or diffusers).

This function automatically detects whether the model is from transformers or diffusers and applies the appropriate export logic.

Parameters:

model (Any) – The full torch model to export. The actual quantized model may be a submodule. Supports both transformers models (e.g., LlamaForCausalLM) and diffusers models/pipelines (e.g., StableDiffusionPipeline, UNet2DConditionModel).
dtype (dtype | None) – The data type used to export the weights of unquantized layers. If None, the model's default data type is used.
export_dir (Path | str) – The target export path.
save_modelopt_state (bool) – Whether to save the modelopt state_dict.
components (list[str] | None) – Only used for diffusers pipelines. Optional list of component names to export. If None, all quantized components are exported.
extra_state_dict (dict[str, Tensor] | None) – Extra state dictionary to add to the exported model.
**kwargs – Internal-only keyword arguments. Supported key: merged_base_safetensor_path (str, optional). When provided, merges the exported diffusion transformer weights with non-transformer components (VAE, vocoder, text encoders, etc.) from this base safetensors file to produce a single-file checkpoint compatible with ComfyUI. Value should be the path to a full base model .safetensors file (e.g. "path/to/ltx-2-19b-dev.safetensors"). Only used for diffusion model exports.
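
Example:

The parameters above can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not the library's reference usage: build_export_kwargs and export_quantized are hypothetical helpers introduced here, the import path modelopt.torch.export.unified_export_hf is assumed from the module name, and the export directories and the component name "unet" are placeholders.

```python
from pathlib import Path


def build_export_kwargs(export_dir, dtype=None, components=None,
                        merged_base_safetensor_path=None):
    """Hypothetical helper: assemble keyword arguments for export_hf_checkpoint."""
    kwargs = {
        "dtype": dtype,                  # None -> use the model's default data type
        "export_dir": Path(export_dir),  # Path or str are both accepted
        "save_modelopt_state": False,    # do not save the modelopt state_dict
        "components": components,        # only used for diffusers pipelines
    }
    if merged_base_safetensor_path is not None:
        # Internal-only key, only used for diffusion model exports.
        kwargs["merged_base_safetensor_path"] = str(merged_base_safetensor_path)
    return kwargs


def export_quantized(model, **kwargs):
    """Hypothetical wrapper around export_hf_checkpoint."""
    # Assumption: the module is importable under this path in your installation.
    from modelopt.torch.export.unified_export_hf import export_hf_checkpoint

    export_hf_checkpoint(model, **kwargs)


# Transformers-style export: no diffusers-specific arguments are needed.
llm_kwargs = build_export_kwargs("/tmp/llm_export")

# Diffusers pipeline: restrict the export to selected components.
# The component name "unet" is a placeholder.
pipe_kwargs = build_export_kwargs("/tmp/pipe_export", components=["unet"])
```

With an already quantized model in hand, export_quantized(model, **llm_kwargs) would forward these arguments to export_hf_checkpoint. Keeping the argument assembly separate from the call makes the diffusers-only options (components, merged_base_safetensor_path) easy to omit for transformers models.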