Launch your first model

This page provides a minimal path for:

  1. Offline, batch-style long-rollout inference with Self-Forcing.

  2. Online interactive world-model serving with LingBot-World.

Prerequisites

Complete the setup in Installation first.

Run Self-Forcing T2V offline inference

Launch an offline inference run using the Self-Forcing model:

uv run --project integrations/self_forcing \
    flashdreams-run self-forcing-wan2.1-t2v-1.3b-taehv \
    --total-blocks 7

The first run takes several minutes because of Triton autotuning and CUDA-graph warmup; subsequent runs finish in well under a minute. The output is written to outputs/self-forcing-wan2.1-t2v-1.3b-taehv.mp4 (16 FPS, 480×832 by default). See Self-Forcing for details on --total-blocks, measured runtimes, and multi-GPU guidance.
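To confirm that a run actually produced a video, a small shell check along the following lines may help. This is only a sketch: check_output is a hypothetical helper defined here, not part of flashdreams-run, and the ffprobe step assumes FFmpeg is installed (it is skipped otherwise).

```shell
# Hypothetical helper (not shipped with the project): verify that a run
# produced a non-empty output file and, if possible, print its properties.
check_output() {
    out="$1"
    # -s is true only when the file exists and is larger than zero bytes.
    if [ ! -s "$out" ]; then
        echo "missing or empty: $out" >&2
        return 1
    fi
    echo "found: $out"
    # ffprobe ships with FFmpeg; this step is skipped when it is not installed.
    # It prints the width, height, and frame rate of the first video stream.
    if command -v ffprobe >/dev/null 2>&1; then
        ffprobe -v error -select_streams v:0 \
            -show_entries stream=width,height,r_frame_rate \
            -of csv=p=0 "$out" || true
    fi
}
```

After the Self-Forcing run finishes, call it as check_output outputs/self-forcing-wan2.1-t2v-1.3b-taehv.mp4. A non-zero exit status means the file is missing or empty; the reported resolution and frame rate can be compared against the defaults listed above.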

Run the LingBot-World interactive server

Launch an interactive serving session using the LingBot-World model:

uv run --project integrations/lingbot \
    flashdreams-run lingbot-world-fast \
    --example-data True \
    --total-blocks 21

Next steps

Explore models:

  • Models - Browse all supported models, their specific launch commands, and configurations.

For developers: