Pipelines and runners#
FlashDreams model integrations are built from two public layers:
- Pipelines (StreamInferencePipelineConfig) that define model behavior.
- Runners (RunnerConfig + Runner) that define CLI-facing I/O.
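The split between the two layers can be illustrated with a minimal sketch. The classes below (`ToyPipelineConfig`, `ToyRunnerConfig`, `ToyRunner`) are hypothetical stand-ins, not the FlashDreams `StreamInferencePipelineConfig`, `RunnerConfig`, or `Runner` APIs. Their fields and the `.mp4` output path are assumptions, chosen only to show that a pipeline config describes model behavior while a runner owns CLI-facing I/O.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPipelineConfig:
    """Hypothetical stand-in: describes model behavior only."""
    num_steps: int = 4
    chunk_frames: int = 8


@dataclass(frozen=True)
class ToyRunnerConfig:
    """Hypothetical stand-in: describes CLI-facing I/O and points at a pipeline."""
    slug: str
    pipeline: ToyPipelineConfig
    output_dir: str = "outputs"


class ToyRunner:
    """Hypothetical stand-in: decides where results are written."""

    def __init__(self, config: ToyRunnerConfig) -> None:
        self.config = config

    def output_path(self, prompt_id: str) -> str:
        # I/O decisions live in the runner, not in the pipeline config.
        return f"{self.config.output_dir}/{self.config.slug}/{prompt_id}.mp4"


runner = ToyRunner(ToyRunnerConfig(slug="toy-model", pipeline=ToyPipelineConfig()))
print(runner.output_path("sample-0"))
```

Keeping the pipeline config free of paths and CLI options means the same model settings could be reused by a different runner without modification.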
Most actively developed model implementations now live under integrations/*
as plugin-style standalone packages. This page continues to document the
in-tree pipeline modules that are still exposed from flashdreams.recipes.
Note
Pipeline modules import the heavy GPU stack (transformer-engine, CUDA
ops) at import time, so this page documents them with automodule and
:no-undoc-members:, which keeps the rendered API focused on the names
that these in-tree modules actually expose. The unified flashdreams-run
CLI shows end-to-end usage; see Models for model launch
examples.
Integration structure (current)#
For new model work, follow the integrations/<name>/ layout:
- config.py: pipeline + runner config literals (slugged entries).
- runner.py: runtime I/O, cache init, generate/finalize loop, persistence.
- pipeline.py and transformer/*: model compute path.
- pyproject.toml: plugin packaging + entry-point registration.
This makes each integration effectively a standalone repository while still
plugging into the same flashdreams-run registry.
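As a rough illustration of the layout listed above, the following sketch checks whether a folder contains the expected entries. The helper `missing_entries` and the throwaway `demo` folder are hypothetical and not part of FlashDreams. The check only tests for presence and treats `transformer` as a directory name.

```python
from pathlib import Path
import tempfile

# Entries named in the integration layout; "transformer" is a directory.
EXPECTED_ENTRIES = ("config.py", "runner.py", "pipeline.py", "transformer", "pyproject.toml")


def missing_entries(integration_dir: Path) -> list[str]:
    """Return the expected entries that are absent from an integration folder."""
    return [name for name in EXPECTED_ENTRIES if not (integration_dir / name).exists()]


# Demo on a temporary folder so the sketch runs without a FlashDreams checkout.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "integrations" / "demo"
    (demo / "transformer").mkdir(parents=True)
    for filename in ("config.py", "runner.py", "pyproject.toml"):
        (demo / filename).touch()
    demo_missing = missing_entries(demo)
    print(demo_missing)  # ['pipeline.py']
```

A check like this could be run on a new integration folder before packaging, since a missing pyproject.toml would mean no entry-point registration.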
Reference integration folders#
NVIDIA OmniDreams#
OmniDreams now ships as a plugin under integrations/omnidreams; it
registers its runners via the flashdreams.runner_configs entry-point
group and is no longer part of the in-tree flashdreams.recipes API
surface. See integrations/omnidreams/README.md for the plugin entry
point and flashdreams-run omnidreams-* for the user-facing CLI.