Developer Guides
These guides cover how the system is structured underneath the CLI: the inference pipeline a recipe runs through, the configuration layer every recipe shares, the integration surface for adding a new method, common patterns for driving the pipeline from Python, and the shape of an interactive serving session. They are conceptual; for per-symbol detail, see the API reference.
The pipeline overview is the anchor for the rest. The config system is the layer every recipe shares; new integrations sit on top of both. Usage patterns and interactive serving describe how the pipeline is embedded in surrounding code.
The end-to-end computation flow: warmup, CUDA-graph capture, the autoregressive-step body, the ring-attention shard group, and finalize. The mental model the rest of the project assumes.
How every overridable field is surfaced as a CLI flag, how recipe defaults compose, and how to layer overrides on top.
The entry-point surface a new recipe ships against: what to subclass, what to register, and where the parity tests live.
Common ways to drive FlashDreams from Python: the CLI, the in-process runner API, and the pipeline-level surface for embedding.
Keeping a streaming session alive: warmup, steady-state generation, and how the WebRTC and gRPC servers under integrations/ wire the pipeline up.
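The layering that the config-system guide describes (every field surfaced as a CLI flag, recipe defaults composed first, overrides applied on top) can be illustrated with a minimal sketch. This is not FlashDreams code: `RecipeConfig`, its fields, `build_parser`, and `resolve` are hypothetical names chosen for illustration, and the real config layer may be built quite differently.

```python
import argparse
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RecipeConfig:
    # Hypothetical fields; the real config schema is not shown in this guide.
    num_steps: int = 4
    chunk_frames: int = 16
    seed: int = 0


def build_parser(config_cls) -> argparse.ArgumentParser:
    """Expose every dataclass field as an optional --flag."""
    parser = argparse.ArgumentParser()
    for f in fields(config_cls):
        flag = "--" + f.name.replace("_", "-")
        # default=None distinguishes "flag omitted" from "flag set".
        parser.add_argument(flag, type=f.type, default=None)
    return parser


def resolve(recipe_defaults: RecipeConfig, argv: list[str]) -> RecipeConfig:
    """Layer CLI overrides on top of a recipe's defaults."""
    args = build_parser(type(recipe_defaults)).parse_args(argv)
    overrides = {k: v for k, v in vars(args).items() if v is not None}
    return replace(recipe_defaults, **overrides)


# A recipe composes its own defaults over the base defaults...
recipe = replace(RecipeConfig(), num_steps=8)
# ...and the command line overrides only the fields it names.
cfg = resolve(recipe, ["--seed", "7"])
```

In this sketch the precedence is base defaults, then recipe defaults, then CLI flags; a field nobody overrides keeps its base value.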
Where these guides fit
Working forward from a recipe, start with the pipeline overview, then read the recipe’s per-model page under Models, then drop into the matching module under API reference for the implementation details. The Get Started section covers the two-command path from install to a generated clip.