Developer Guides

Inference pipeline overview

The end-to-end computation flow: warmup, CUDA-graph capture, the autoregressive-step body, the ring-attention shard group, and finalize. The mental model the rest of the project assumes.

Config system

How every overridable field is surfaced as a CLI flag, how recipe defaults compose, and how to layer overrides on top.

Add a new method

The entry-point surface a new recipe ships against: what to subclass, what to register, and where the parity tests live.

Where these guides fit

These guides are conceptual. For a specific recipe, see its per-model page under Models; for the per-symbol reference, see CLI and API Reference; and for the two-command path from install to a generated clip, see Get Started.