Launch your first model

This page gives a minimal path to two first runs:

Offline, batch-style long-rollout inference with Self-Forcing.
Online, interactive world-model serving with LingBot-World.

Prerequisites

Complete the setup in Installation first.

Run Self-Forcing T2V offline inference

Launch an offline inference run using the Self-Forcing model:

uv run --project integrations/self_forcing \
    flashdreams-run self-forcing-wan2.1-t2v-1.3b-taehv \
    --total-blocks 7

The first run takes several minutes because of Triton autotuning and CUDA-graph warmup; subsequent runs finish in well under a minute. The output is written to outputs/self-forcing-wan2.1-t2v-1.3b-taehv.mp4 (16 FPS, 480×832 by default). See Self-Forcing for details on --total-blocks, measured runtimes, and multi-GPU guidance.
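As a quick sanity check after the run, a sketch like the following can confirm that the output file was written and, if FFmpeg is available, print the video stream's dimensions and frame rate for comparison with the defaults quoted above. Note that check_output is a hypothetical helper defined here for illustration, not part of flashdreams, and ffprobe is assumed to be installed separately; the probe step is skipped when either the file or ffprobe is missing.

```shell
# Hypothetical helper (not part of flashdreams): succeed only if the
# given path exists and is non-empty.
check_output() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1"
        return 1
    fi
}

# Default output path of the Self-Forcing run described above.
out="outputs/self-forcing-wan2.1-t2v-1.3b-taehv.mp4"

# Probe the video stream only if the file exists and ffprobe is installed
# (assumption: FFmpeg's ffprobe is on PATH; it is not a flashdreams dependency).
if check_output "$out" && command -v ffprobe >/dev/null 2>&1; then
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height,r_frame_rate \
        -of csv=p=0 "$out"
fi
```

Run it from the same directory as the launch command, since the output path is relative.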

Run LingBot-World interactive server

Launch an interactive serving session using the LingBot-World model:

uv run --project integrations/lingbot \
    flashdreams-run lingbot-world-fast \
    --example-data True \
    --total-blocks 21

Next steps

Explore models:

Models - Browse all supported models, their specific launch commands, and configurations.

For developers:

Inference pipeline overview - Learn about the system architecture and generation loop.
Config system - Understand how to modify pipeline and runner configurations.
Add a new method - Guide to adding your own custom models and methods.
API reference - Check the Python API and CLI references.