diffusers_utils

Code that exports quantized Hugging Face models for deployment.

Functions

generate_diffusion_dummy_forward_fn

Create a dummy forward function for diffusion(-like) models.

generate_diffusion_dummy_inputs

Generate dummy inputs for diffusion model forward pass.

get_diffusers_components

Get all exportable components from a diffusion(-like) pipeline.

get_diffusion_components

Get all exportable components from a diffusion(-like) pipeline.

get_qkv_group_key

Extract the parent attention block path and QKV type for grouping.

hide_quantizers_from_state_dict

Context manager that temporarily removes quantizer modules from the model.

infer_dtype_from_model

Infer the dtype from a model's parameters.

is_diffusers_object

Return True if model is a diffusers pipeline/component or LTX-2 pipeline.

is_qkv_projection

Check if a module name corresponds to a QKV projection layer.

generate_diffusion_dummy_forward_fn(model)

Create a dummy forward function for diffusion(-like) models.

  • For diffusers components, this uses generate_diffusion_dummy_inputs() and calls model(**kwargs).

  • For LTX-2 stage-1 transformer (X0Model), the forward signature is model(video: Modality|None, audio: Modality|None, perturbations: BatchedPerturbationConfig), so we build tiny ltx_core dataclasses and call the model directly.

Parameters:

model (Module)

Return type:

Callable[[], None]
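
A minimal usage sketch; the import path of diffusers_utils and the checkpoint id are assumptions, adjust them to your installation:

    import torch
    from diffusers import UNet2DConditionModel

    # Assumed import path -- adjust to wherever diffusers_utils lives in your package.
    from diffusers_utils import generate_diffusion_dummy_forward_fn

    # Load a single diffusers component (the checkpoint id is a placeholder).
    unet = UNet2DConditionModel.from_pretrained(
        "<repo-id>", subfolder="unet", torch_dtype=torch.float16
    ).to("cuda")

    # Zero-argument callable that runs one forward pass on tiny dummy inputs,
    # e.g. usable as a calibration forward loop during quantization.
    dummy_forward = generate_diffusion_dummy_forward_fn(unet)
    dummy_forward()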

generate_diffusion_dummy_inputs(model, device, dtype)

Generate dummy inputs for diffusion model forward pass.

Different diffusion models have very different input formats:

  • DiTTransformer2DModel: 4D hidden_states + class_labels

  • FluxTransformer2DModel: 3D hidden_states + encoder_hidden_states + img_ids + txt_ids + pooled_projections

  • SD3Transformer2DModel: 4D hidden_states + encoder_hidden_states + pooled_projections

  • UNet2DConditionModel: 4D sample + timestep + encoder_hidden_states

  • WanTransformer3DModel: 5D hidden_states + encoder_hidden_states + timestep

Parameters:
  • model (Module) – The diffusion model component.

  • device (device) – Device to create tensors on.

  • dtype (dtype) – Data type for tensors.

Returns:

Dictionary of dummy inputs, or None if the model type is not supported.

Return type:

dict[str, Tensor] | None
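
A sketch of generating and running dummy inputs for an already-loaded diffusion component model; the import path is an assumption:

    import torch

    # Assumed import path -- adjust to your installation.
    from diffusers_utils import generate_diffusion_dummy_inputs

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # `model` is any supported component, e.g. a UNet2DConditionModel.
    dummy_inputs = generate_diffusion_dummy_inputs(model, device, torch.float16)
    if dummy_inputs is None:
        raise ValueError(f"Unsupported diffusion model type: {type(model).__name__}")

    # Inputs are keyword arguments for the component's forward pass.
    with torch.no_grad():
        model(**dummy_inputs)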

get_diffusers_components(model, components=None)

Get all exportable components from a diffusion(-like) pipeline.

Supports:

  • diffusers DiffusionPipeline: returns pipeline.components

  • diffusers component nn.Module (e.g., UNet / transformer)

  • LTX-2 pipeline (duck-typed): returns only the stage-1 transformer, as stage_1_transformer

Parameters:
  • model (Any) – The pipeline or component.

  • components (list[str] | None) – Optional list of component names to filter. If None, all components are returned.

Returns:

Dictionary mapping component names to their instances (can be nn.Module, tokenizers, schedulers, etc.).

Return type:

dict[str, Any]

get_diffusion_components(model, components=None)

Get all exportable components from a diffusion(-like) pipeline.

Supports:

  • diffusers DiffusionPipeline: returns pipeline.components

  • diffusers component nn.Module (e.g., UNet / transformer)

  • LTX-2 pipeline (duck-typed): returns only the stage-1 transformer, as stage_1_transformer

Parameters:
  • model (Any) – The pipeline or component.

  • components (list[str] | None) – Optional list of component names to filter. If None, all components are returned.

Returns:

Dictionary mapping component names to their instances (can be nn.Module, tokenizers, schedulers, etc.).

Return type:

dict[str, Any]
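
A usage sketch that applies to either function (get_diffusion_components behaves identically); the import path and checkpoint id are assumptions:

    import torch.nn as nn
    from diffusers import DiffusionPipeline

    # Assumed import path -- adjust to your installation.
    from diffusers_utils import get_diffusers_components, is_diffusers_object

    pipe = DiffusionPipeline.from_pretrained("<repo-id>")  # placeholder checkpoint id
    assert is_diffusers_object(pipe)

    # All components: denoiser, VAE, text encoders, tokenizers, scheduler, ...
    components = get_diffusers_components(pipe)

    # Filter to specific names (use names that exist in this pipeline).
    unet_only = get_diffusers_components(pipe, components=["unet"])

    # Only nn.Module components are candidates for quantization/export.
    modules = {name: c for name, c in components.items() if isinstance(c, nn.Module)}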

get_qkv_group_key(module_name)

Extract the parent attention block path and QKV type for grouping.

QKV projections should only be fused within the same attention block AND for the same type of attention (main vs added/cross).

Examples

  • ‘transformer_blocks.0.attn.to_q’ -> ‘transformer_blocks.0.attn.main’

  • ‘transformer_blocks.0.attn.to_k’ -> ‘transformer_blocks.0.attn.main’

  • ‘transformer_blocks.5.attn.add_q_proj’ -> ‘transformer_blocks.5.attn.add’

  • ‘transformer_blocks.5.attn.add_k_proj’ -> ‘transformer_blocks.5.attn.add’

Parameters:

module_name (str) – The full module name path.

Returns:

A string key representing the attention block and QKV type for grouping.

Return type:

str
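
A sketch of how the key can bucket projection names into fusion groups; the module names are illustrative and the import path is an assumption:

    from collections import defaultdict

    # Assumed import path -- adjust to your installation.
    from diffusers_utils import get_qkv_group_key

    qkv_names = [
        "transformer_blocks.0.attn.to_q",
        "transformer_blocks.0.attn.to_k",
        "transformer_blocks.0.attn.to_v",
        "transformer_blocks.5.attn.add_q_proj",
        "transformer_blocks.5.attn.add_k_proj",
    ]

    groups: dict[str, list[str]] = defaultdict(list)
    for name in qkv_names:
        groups[get_qkv_group_key(name)].append(name)

    # groups now has one entry per fusible set:
    #   "transformer_blocks.0.attn.main" -> to_q, to_k, to_v
    #   "transformer_blocks.5.attn.add"  -> add_q_proj, add_k_proj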

hide_quantizers_from_state_dict(model)

Context manager that temporarily removes quantizer modules from the model.

This allows save_pretrained to save the model without quantizer buffers like _amax. The quantizers are restored after exiting the context.

Parameters:

model (Module) – The model with quantizers to temporarily hide.

Yields:

None - the model can be saved within the context.
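
A sketch of saving a quantized component without its quantizer buffers; the import path is an assumption and unet stands for any quantized diffusers component:

    # Assumed import path -- adjust to your installation.
    from diffusers_utils import hide_quantizers_from_state_dict

    with hide_quantizers_from_state_dict(unet):
        # Inside the context the state dict carries no quantizer buffers
        # such as _amax, so the checkpoint looks like a plain diffusers model.
        unet.save_pretrained("exported_unet")
    # Quantizer modules are restored once the context exits.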

infer_dtype_from_model(model)

Infer the dtype from a model’s parameters.

Parameters:

model (Module) – The model to infer dtype from.

Returns:

The dtype of the model’s parameters, defaulting to float16 if no parameters found.

Return type:

dtype
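
A small sketch, including the documented float16 fallback for parameter-less modules; the import path is an assumption:

    import torch
    import torch.nn as nn

    # Assumed import path -- adjust to your installation.
    from diffusers_utils import infer_dtype_from_model

    bf16_layer = nn.Linear(8, 8).to(torch.bfloat16)
    assert infer_dtype_from_model(bf16_layer) == torch.bfloat16

    no_params = nn.Identity()  # module with no parameters
    assert infer_dtype_from_model(no_params) == torch.float16  # documented default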

is_diffusers_object(model)

Return True if model is a diffusers pipeline/component or LTX-2 pipeline.

Parameters:

model (Any)

Return type:

bool

is_qkv_projection(module_name)

Check if a module name corresponds to a QKV projection layer.

In diffusers, QKV projections typically have names like:

  • to_q, to_k, to_v (most common in diffusers attention)

  • q_proj, k_proj, v_proj

  • query, key, value

  • add_q_proj, add_k_proj, add_v_proj (for additional attention in some models)

We exclude:

  • norm*.linear (AdaLayerNorm modulation layers)

  • proj_out, proj_mlp (output projections)

  • ff.*, mlp.* (feed-forward layers)

  • to_out (output projection)

Parameters:

module_name (str) – The full module name path.

Returns:

True if this is a QKV projection layer.

Return type:

bool
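
A sketch combining is_qkv_projection with get_qkv_group_key to collect fusible QKV layers from a component; the import path is an assumption, and restricting to nn.Linear submodules is a choice of this sketch rather than a requirement of the API:

    from collections import defaultdict

    import torch.nn as nn

    # Assumed import path -- adjust to your installation.
    from diffusers_utils import get_qkv_group_key, is_qkv_projection

    def collect_qkv_groups(model: nn.Module) -> dict[str, list[str]]:
        """Group names of fusible QKV projections by attention block and type."""
        groups: dict[str, list[str]] = defaultdict(list)
        for name, module in model.named_modules():
            if isinstance(module, nn.Linear) and is_qkv_projection(name):
                groups[get_qkv_group_key(name)].append(name)
        return dict(groups)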