config

Pydantic configuration classes for the fastgen distillation pipelines.

Configurations are layered so that a method-specific config (e.g. DMDConfig) inherits the shared diffusion-distillation hyperparameters from DistillationConfig. All classes inherit from modelopt.torch.opt.config.ModeloptBaseConfig, which provides torch-safe serialization and dict-like iteration.

The default values in DMDConfig mirror the FastGen Wan 2.2 5B experiment at FastGen/fastgen/configs/experiments/WanT2V/config_dmd2_wan22_5b.py.
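The layering described above can be illustrated with a minimal sketch built directly on pydantic (v2 assumed). The classes SketchDistillationConfig and SketchDMDConfig are hypothetical stand-ins, not the real modelopt classes; they borrow two field names from this page, and the default values are placeholders rather than the library's defaults. The sketch also shows what the shared model_config settings do: 'extra': 'forbid' rejects unknown fields, and 'validate_assignment': True re-validates a field whenever it is assigned.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class SketchDistillationConfig(BaseModel):
    """Hypothetical stand-in for the shared base config."""

    # Same settings as the model_config shown on this page.
    model_config = ConfigDict(extra="forbid", validate_assignment=True)

    student_sample_steps: int = 4  # placeholder default, not the library's


class SketchDMDConfig(SketchDistillationConfig):
    """Hypothetical method-specific config; inherits fields and model_config."""

    student_update_freq: int = 5  # placeholder default, not the library's


cfg = SketchDMDConfig(student_sample_steps=2)

# The subclass exposes both the inherited and the method-specific field.
print(cfg.student_sample_steps, cfg.student_update_freq)

# extra='forbid': an unknown keyword is rejected at construction time.
try:
    SketchDMDConfig(not_a_field=1)
except ValidationError as err:
    print("rejected unknown field:", err.error_count(), "error(s)")

# validate_assignment=True: a bad value is rejected on attribute assignment.
try:
    cfg.student_update_freq = "often"
except ValidationError as err:
    print("rejected bad assignment:", err.error_count(), "error(s)")
```

Because model_config is inherited, a typo in a field name fails loudly in every subclass instead of being silently ignored.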

Classes

`DMDConfig`	Hyperparameters for DMD / DMD2 distribution-matching distillation.
`DistillationConfig`	Shared hyperparameters for diffusion step-distillation methods.
`EMAConfig`	Exponential moving average (EMA) hyperparameters for the student network.
`SampleTimestepConfig`	Timestep sampling distribution for diffusion training.

class DMDConfig

Bases: DistillationConfig

Hyperparameters for DMD / DMD2 distribution-matching distillation.

Default values are tuned for Wan 2.2 5B; callers are expected to adjust them per model. See FastGen/fastgen/configs/experiments/WanT2V/config_dmd2_wan22_5b.py.

backward_simulation: bool

ema: EMAConfig | None

fake_score_pred_type: PredType | None

classmethod from_yaml(config_file)

Construct a DMDConfig from a YAML file.

Thin wrapper around modelopt.torch.fastgen.loader.load_dmd_config(). The resolver searches the built-in modelopt_recipes/ package first, then the filesystem. Suffixes (.yml / .yaml) may be omitted.

Parameters: config_file (str | Path)
Return type: DMDConfig
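The lookup order described for from_yaml (built-in recipes first, then the filesystem, with an optional .yml / .yaml suffix) can be sketched as follows. This is not the implementation of modelopt.torch.fastgen.loader.load_dmd_config(); resolve_config_path and its search_roots argument are hypothetical names introduced only to illustrate the stated behavior, and the order in which the two suffixes are tried is an assumption.

```python
import tempfile
from pathlib import Path


def resolve_config_path(name, search_roots):
    """Return the first existing file for `name` across `search_roots`.

    Hypothetical helper: roots are tried in order, and if `name` has no
    .yml/.yaml suffix, both suffixes are tried as well.
    """
    candidates = [str(name)]
    if Path(name).suffix not in (".yml", ".yaml"):
        candidates += [f"{name}.yml", f"{name}.yaml"]
    for root in search_roots:
        for candidate in candidates:
            path = Path(root) / candidate
            if path.is_file():
                return path
    raise FileNotFoundError(f"no config named {name!r} in {list(search_roots)}")


# Demo with two temporary directories standing in for the built-in
# recipes package and the user's filesystem.
with tempfile.TemporaryDirectory() as builtin, tempfile.TemporaryDirectory() as local:
    (Path(builtin) / "dmd_demo.yaml").write_text("student_update_freq: 5\n")
    (Path(local) / "dmd_demo.yaml").write_text("student_update_freq: 1\n")
    (Path(local) / "only_local.yml").write_text("student_update_freq: 2\n")

    # The built-in root wins when both roots contain the file.
    found_builtin = resolve_config_path("dmd_demo", [builtin, local])
    builtin_wins = found_builtin.parent == Path(builtin)

    # Falls through to the filesystem root when the built-in root has no match.
    found_local = resolve_config_path("only_local", [builtin, local])
    local_name = found_local.name
```

With the real class, the corresponding call would be DMDConfig.from_yaml(config_file), where config_file is a str or Path as documented above.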

gan_loss_weight_gen: float

gan_r1_reg_alpha: float

gan_r1_reg_weight: float

gan_use_same_t_noise: bool

model_config = {'extra': 'forbid', 'validate_assignment': True}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

student_update_freq: int

class DistillationConfig

Bases: ModeloptBaseConfig

Shared hyperparameters for diffusion step-distillation methods.

Concrete methods subclass this config to add method-specific fields (see DMDConfig).

guidance_scale: float | None

model_config = {'extra': 'forbid', 'validate_assignment': True}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

num_train_timesteps: int | None

pred_type: PredType

sample_t_cfg: SampleTimestepConfig

student_sample_steps: int

student_sample_type: Literal['sde', 'ode']

class EMAConfig

Bases: ModeloptBaseConfig

Exponential moving average (EMA) hyperparameters for the student network.

batch_size: int

decay: float

dtype: Literal['float32', 'bfloat16', 'float16'] | None

fsdp2: bool

gamma: float

halflife_kimg: float

mode: Literal['full_tensor', 'local_shard']

model_config = {'extra': 'forbid', 'validate_assignment': True}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

rampup_ratio: float | None

start_iter: int

type: Literal['constant', 'halflife', 'power']
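This page does not document how the three EMA types turn the fields above into a decay factor. The sketch below shows one common reading, stated as an assumption: 'constant' uses decay directly, 'halflife' follows the EDM-style schedule driven by halflife_kimg, rampup_ratio, and batch_size, and 'power' uses the power-function profile (1 - 1/t)^(gamma + 1). The function names ema_beta and ema_update are hypothetical, and the FastGen implementation may differ in detail.

```python
def ema_beta(kind, step, *, decay, gamma, halflife_kimg, rampup_ratio, batch_size):
    """Return the EMA decay factor for one update (assumed formulas)."""
    if kind == "constant":
        return decay
    if kind == "power":
        # Power-function profile; step is clamped so the first update gives 0.
        t = max(step, 1)
        return (1.0 - 1.0 / t) ** (gamma + 1.0)
    if kind == "halflife":
        # Half-life measured in images; optionally ramped up with the
        # number of images seen so far.
        halflife_nimg = halflife_kimg * 1000.0
        if rampup_ratio is not None:
            halflife_nimg = min(halflife_nimg, step * batch_size * rampup_ratio)
        return 0.5 ** (batch_size / max(halflife_nimg, 1e-8))
    raise ValueError(f"unknown EMA type: {kind!r}")


def ema_update(ema_params, params, beta):
    """Standard EMA step: ema <- beta * ema + (1 - beta) * param."""
    return [beta * e + (1.0 - beta) * p for e, p in zip(ema_params, params)]


# Placeholder hyperparameters for illustration only (not library defaults).
hp = dict(decay=0.9, gamma=16.97, halflife_kimg=0.064, rampup_ratio=None, batch_size=64)

beta_const = ema_beta("constant", 10, **hp)
beta_power_first = ema_beta("power", 1, **hp)
beta_half = ema_beta("halflife", 10, **hp)  # half-life of 64 images, batch of 64

ema = ema_update([0.0, 2.0], [1.0, 2.0], beta_const)
```

In this reading, start_iter would gate when updates begin and dtype would set the storage precision of the averaged weights; both are left out of the sketch because the page does not describe them.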

class SampleTimestepConfig

Bases: ModeloptBaseConfig

Timestep sampling distribution for diffusion training.

max_t: float

min_t: float

model_config = {'extra': 'forbid', 'validate_assignment': True}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

p_mean: float

p_std: float

shift: float

t_list: list[float] | None

time_dist_type: TimeDistType
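The members of TimeDistType are not listed on this page, so the following is only a sketch of how fields such as p_mean, p_std, shift, min_t, and max_t are often combined in flow-matching training. It assumes a logit-normal distribution followed by the usual timestep shift t' = shift * t / (1 + (shift - 1) * t) and a clamp to [min_t, max_t]. The functions apply_shift and sample_timestep are hypothetical, and the parameter values are placeholders.

```python
import math
import random


def apply_shift(t, shift):
    """Timestep shift; shift == 1 leaves t unchanged, shift > 1 pushes t toward 1."""
    return shift * t / (1.0 + (shift - 1.0) * t)


def sample_timestep(rng, *, p_mean, p_std, shift, min_t, max_t):
    """Draw one timestep under the assumed logit-normal scheme."""
    z = rng.gauss(p_mean, p_std)      # normal sample in logit space
    t = 1.0 / (1.0 + math.exp(-z))    # sigmoid maps it into (0, 1)
    t = apply_shift(t, shift)
    return min(max(t, min_t), max_t)  # clamp to the configured range


rng = random.Random(0)  # seeded for reproducibility
samples = [
    sample_timestep(rng, p_mean=0.0, p_std=1.0, shift=3.0, min_t=0.001, max_t=0.999)
    for _ in range(1000)
]
print(min(samples), max(samples))
```

When t_list is set, a natural reading is that timesteps are drawn from that explicit list instead of a continuous distribution; the page does not confirm this.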