Developer Guides
Inference pipeline overview
The end-to-end computation flow: warmup, CUDA-graph capture, the autoregressive-step body, the ring-attention shard group, and finalize. The mental model the rest of the project assumes.
Config system
How every overridable field is surfaced as a CLI flag, how recipe defaults compose, and how to layer overrides on top.
Add a new method
The entry-point surface a new recipe ships against: what to subclass, what to register, and where the parity tests live.
Where these guides fit
These guides are conceptual. For a specific recipe, see its per-model page under Models; for the per-symbol reference, see CLI and API Reference; and for the two-command path from install to a generated clip, see Get Started.