diffusers_utils
Code that exports quantized Hugging Face models for deployment.
Functions
- generate_diffusion_dummy_forward_fn – Create a dummy forward function for diffusion(-like) models.
- generate_diffusion_dummy_inputs – Generate dummy inputs for diffusion model forward pass.
- get_diffusers_components – Get all exportable components from a diffusion(-like) pipeline.
- get_diffusion_components – Get all exportable components from a diffusion(-like) pipeline.
- get_diffusion_model_type – Detect the diffusion model type for merge function dispatch.
- get_qkv_group_key – Extract the parent attention block path and QKV type for grouping.
- hide_quantizers_from_state_dict – Context manager that temporarily removes quantizer modules from the model.
- infer_dtype_from_model – Infer the dtype from a model's parameters.
- is_diffusers_object – Return True if model is a diffusers pipeline/component or LTX-2 pipeline.
- is_qkv_projection – Check if a module name corresponds to a QKV projection layer.
- merge_diffusion_checkpoint – Merge transformer weights with a base checkpoint and build ComfyUI metadata.
- generate_diffusion_dummy_forward_fn(model)
Create a dummy forward function for diffusion(-like) models.
For diffusers components, this uses generate_diffusion_dummy_inputs() and calls model(**kwargs).
For the LTX-2 stage-1 transformer (X0Model), the forward signature is model(video: Modality | None, audio: Modality | None, perturbations: BatchedPerturbationConfig), so tiny ltx_core dataclasses are built and the model is called directly.
- Parameters:
model (Module)
- Return type:
Callable[[], None]
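The closure pattern behind this function can be sketched minimally. This is an illustrative stand-in (the helper name is hypothetical, and the real function also builds the model-specific dummy inputs itself):

```python
from typing import Any, Callable


def make_dummy_forward(
    model: Callable[..., Any], dummy_inputs: dict[str, Any]
) -> Callable[[], None]:
    """Bind pre-built dummy inputs into a zero-argument forward callable.

    Sketch only: generate_diffusion_dummy_forward_fn additionally
    constructs the dummy inputs per model type before binding them.
    """

    def forward() -> None:
        model(**dummy_inputs)  # outputs are discarded; only the pass matters

    return forward
```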
- generate_diffusion_dummy_inputs(model, device, dtype)
Generate dummy inputs for diffusion model forward pass.
Different diffusion models have very different input formats:
- DiTTransformer2DModel: 4D hidden_states + class_labels
- FluxTransformer2DModel: 3D hidden_states + encoder_hidden_states + img_ids + txt_ids + pooled_projections
- SD3Transformer2DModel: 4D hidden_states + encoder_hidden_states + pooled_projections
- UNet2DConditionModel: 4D sample + timestep + encoder_hidden_states
- WanTransformer3DModel: 5D hidden_states + encoder_hidden_states + timestep
- Parameters:
model (Module) – The diffusion model component.
device (device) – Device to create tensors on.
dtype (dtype) – Data type for tensors.
- Returns:
Dictionary of dummy inputs, or None if model type is not supported.
- Return type:
dict[str, Tensor] | None
- get_diffusers_components(model, components=None)
Get all exportable components from a diffusion(-like) pipeline.
Supports:
- diffusers DiffusionPipeline: returns pipeline.components
- diffusers component nn.Module (e.g., UNet / transformer)
- LTX-2 pipeline (duck-typed): returns the stage-1 transformer only, as stage_1_transformer
- Parameters:
model (Any) – The pipeline or component.
components (list[str] | None) – Optional list of component names to filter. If None, all components are returned.
- Returns:
Dictionary mapping component names to their instances (can be nn.Module, tokenizers, schedulers, etc.).
- Return type:
dict[str, Any]
- get_diffusion_components(model, components=None)
Get all exportable components from a diffusion(-like) pipeline.
Supports:
- diffusers DiffusionPipeline: returns pipeline.components
- diffusers component nn.Module (e.g., UNet / transformer)
- LTX-2 pipeline (duck-typed): returns the stage-1 transformer only, as stage_1_transformer
- Parameters:
model (Any) – The pipeline or component.
components (list[str] | None) – Optional list of component names to filter. If None, all components are returned.
- Returns:
Dictionary mapping component names to their instances (can be nn.Module, tokenizers, schedulers, etc.).
- Return type:
dict[str, Any]
- get_diffusion_model_type(pipe)
Detect the diffusion model type for merge function dispatch.
To add a new model type, add a detection clause here and a corresponding merge function in DIFFUSION_MERGE_FUNCTIONS.
- Parameters:
pipe (Any) – The pipeline or component being exported.
- Returns:
A string key into DIFFUSION_MERGE_FUNCTIONS.
- Raises:
ValueError – If the model type is not supported.
- Return type:
str
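The dispatch pattern reads roughly as follows. Both the class-name check and the returned key ("flux") are hypothetical stand-ins; the real detection clauses and keys live alongside DIFFUSION_MERGE_FUNCTIONS:

```python
from typing import Any


def get_model_type_sketch(pipe: Any) -> str:
    """Map a pipeline/component to a merge-function key by class name.

    Sketch only: the "Flux" clause and "flux" key are hypothetical
    examples of what a detection clause might look like.
    """
    name = type(pipe).__name__
    if "Flux" in name:  # hypothetical detection clause
        return "flux"
    raise ValueError(f"Unsupported diffusion model type: {name!r}")
```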
- get_qkv_group_key(module_name)
Extract the parent attention block path and QKV type for grouping.
QKV projections should only be fused within the same attention block AND for the same type of attention (main vs added/cross).
Examples
‘transformer_blocks.0.attn.to_q’ -> ‘transformer_blocks.0.attn.main’
‘transformer_blocks.0.attn.to_k’ -> ‘transformer_blocks.0.attn.main’
‘transformer_blocks.5.attn.add_q_proj’ -> ‘transformer_blocks.5.attn.add’
‘transformer_blocks.5.attn.add_k_proj’ -> ‘transformer_blocks.5.attn.add’
- Parameters:
module_name (str) – The full module name path.
- Returns:
A string key representing the attention block and QKV type for grouping.
- Return type:
str
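The grouping rule can be reimplemented in a few lines. This is a sketch that reproduces the documented examples, not the library's actual code:

```python
def qkv_group_key(module_name: str) -> str:
    """Group a QKV projection by parent attention block and attention type."""
    parent, _, proj = module_name.rpartition(".")
    # add_q_proj / add_k_proj / add_v_proj belong to the added (cross)
    # attention stream and must not be fused with the same block's
    # main to_q / to_k / to_v projections.
    qkv_type = "add" if proj.startswith("add_") else "main"
    return f"{parent}.{qkv_type}"
```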
- hide_quantizers_from_state_dict(model)
Context manager that temporarily removes quantizer modules from the model.
This allows save_pretrained to save the model without quantizer buffers like _amax. The quantizers are restored after exiting the context.
- Parameters:
model (Module) – The model with quantizers to temporarily hide.
- Yields:
None - the model can be saved within the context.
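The restore-on-exit mechanics can be sketched with a generic context manager over a name-to-module mapping. This is a torch-free illustration of the same idea, not the actual implementation:

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def hide_entries(
    modules: dict[str, Any], should_hide: Callable[[str], bool]
) -> Iterator[None]:
    """Temporarily pop matching entries, restoring them even on error."""
    hidden = {name: modules.pop(name) for name in list(modules) if should_hide(name)}
    try:
        yield  # save_pretrained-style serialization happens inside the context
    finally:
        modules.update(hidden)  # quantizers are restored after exiting
```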
- infer_dtype_from_model(model)
Infer the dtype from a model’s parameters.
- Parameters:
model (Module) – The model to infer dtype from.
- Returns:
The dtype of the model’s parameters, defaulting to float16 if no parameters found.
- Return type:
dtype
- is_diffusers_object(model)
Return True if model is a diffusers pipeline/component or LTX-2 pipeline.
- Parameters:
model (Any)
- Return type:
bool
- is_qkv_projection(module_name)
Check if a module name corresponds to a QKV projection layer.
In diffusers, QKV projections typically have names like:
- to_q, to_k, to_v (most common in diffusers attention)
- q_proj, k_proj, v_proj
- query, key, value
- add_q_proj, add_k_proj, add_v_proj (for additional attention in some models)
We exclude:
- norm*.linear (AdaLayerNorm modulation layers)
- proj_out, proj_mlp (output projections)
- ff.*, mlp.* (feed-forward layers)
- to_out (output projection)
- Parameters:
module_name (str) – The full module name path.
- Returns:
True if this is a QKV projection layer.
- Return type:
bool
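The name test can be approximated directly from the inclusion and exclusion lists above. This is a simplified reimplementation; edge cases in the real code may differ:

```python
QKV_LEAF_NAMES = {
    "to_q", "to_k", "to_v",
    "q_proj", "k_proj", "v_proj",
    "query", "key", "value",
    "add_q_proj", "add_k_proj", "add_v_proj",
}


def is_qkv_projection_sketch(module_name: str) -> bool:
    """True if the final path segment names a Q/K/V projection."""
    parts = module_name.split(".")
    leaf = parts[-1]
    # Exclusion: norm*.linear (AdaLayerNorm modulation layers).
    if leaf == "linear" and len(parts) > 1 and parts[-2].startswith("norm"):
        return False
    # Exclusions: feed-forward subtrees and output projections.
    if any(part in ("ff", "mlp", "to_out") for part in parts):
        return False
    if leaf in ("proj_out", "proj_mlp"):
        return False
    return leaf in QKV_LEAF_NAMES
```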
- merge_diffusion_checkpoint(state_dict, merged_base_safetensor_path, model_type, hf_quant_config=None)
Merge transformer weights with a base checkpoint and build ComfyUI metadata.
Dispatches to the model-specific merge function in DIFFUSION_MERGE_FUNCTIONS and, when hf_quant_config is provided, embeds quantization_config and per-layer _quantization_metadata in the safetensors metadata for ComfyUI.
- Parameters:
state_dict (dict[str, Tensor]) – The transformer state dict (already on CPU).
merged_base_safetensor_path (str) – Path to the full base model .safetensors file containing all components (transformer, VAE, vocoder, etc.), e.g. "path/to/ltx-2-19b-dev.safetensors".
model_type (str) – Key into DIFFUSION_MERGE_FUNCTIONS for the model-specific merge.
hf_quant_config (dict | None) – If provided, embed quantization config and per-layer _quantization_metadata in the returned metadata dict.
- Returns:
Tuple of (merged_state_dict, metadata) where metadata is the base checkpoint’s original metadata augmented with any quantization entries.
- Return type:
tuple[dict[str, Tensor], dict[str, str]]