Pipelines and runners

FlashDreams model integrations are built from two public layers:

  • Pipelines (StreamInferencePipelineConfig) that define model behavior.

  • Runners (RunnerConfig + Runner) that define CLI-facing I/O.

Most actively developed model implementations now live under integrations/* as plugin-style standalone packages. This page continues to document the in-tree pipeline modules that are still exposed from flashdreams.recipes.

Note

Because pipeline modules import the heavy GPU stack (transformer-engine, CUDA ops) at import time, this page documents them with automodule and the :no-undoc-members: option, which keeps the rendered API focused on the names these in-tree modules actually expose. For end-to-end usage, see the unified flashdreams-run CLI; for model launch examples, see Models.

Integration structure (current)

For new model work, follow the integrations/<name>/ layout:

  • config.py: pipeline + runner config literals (slugged entries).

  • runner.py: runtime I/O, cache init, generate/finalize loop, persistence.

  • pipeline.py and transformer/*: model compute path.

  • pyproject.toml: plugin packaging + entry-point registration.

This layout makes each integration effectively a standalone repository that still plugs into the same flashdreams-run registry.
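As a quick sanity check when starting a new integration, the layout above can be verified with a few lines of Python. This is only a sketch: the helper name `missing_layout_files` is hypothetical and not part of FlashDreams, and it assumes the four listed files sit directly in the integration folder (the transformer/ directory is not checked).

```python
from pathlib import Path

# File names taken from the layout list above. The helper itself is a
# hypothetical illustration, not a FlashDreams API.
REQUIRED_FILES = ("config.py", "runner.py", "pipeline.py", "pyproject.toml")


def missing_layout_files(integration_dir):
    """Return the required file names that are absent from integration_dir."""
    root = Path(integration_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py integrations/<name>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_layout_files(target)
    if missing:
        print(f"{target} is missing: {', '.join(missing)}")
    else:
        print(f"{target} has all expected files")
```

An empty result only means the files exist; it says nothing about whether the entry points in pyproject.toml are declared correctly.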

Reference integration folders

NVIDIA OmniDreams

OmniDreams now ships as a plugin under integrations/omnidreams; it registers its runners via the flashdreams.runner_configs entry-point group and is no longer part of the in-tree flashdreams.recipes API surface. See integrations/omnidreams/README.md for the plugin entry point and flashdreams-run omnidreams-* for the user-facing CLI.

Wan

TAEHV