Troubleshooting
Use this page for common first-run failures before opening an issue. Each entry lists the visible symptom, the most likely cause, and the next concrete step to try.
CUDA or PyTorch build mismatch
Symptoms:
A CUDA extension fails to build or load.
interactive-drive --manifest example_world_model_perf.yaml exits instead of falling back to the default PyTorch path.
Errors mention nvcc, a CUDA version, a GPU architecture, or missing CUDA libraries.
Likely cause:
The OmniDreams perf manifest uses native DiT and LightVAE acceleration with
native_dit_acceleration: required. That path requires a source checkout,
git, a CUDA toolchain with nvcc matching the installed PyTorch build,
and a Blackwell-class GPU (SM 12.0) or newer.
Fix or next step:
Start with the non-perf OmniDreams launch documented on the NVIDIA OmniDreams page. If you need the perf manifest, prepare the pinned third-party sources first:
uv run --package flashdreams-omnidreams omnidreams-prepare --perf
Then verify that the machine has the required GPU and a CUDA toolchain that matches the PyTorch build before launching:
python -c "import torch; print(torch.__version__, torch.version.cuda)"
nvcc --version
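To compare the two versions without reading them by eye, a small helper along these lines may help. It is a minimal sketch: the function names are hypothetical, and it assumes that "matching" means the same major.minor CUDA version, which may be looser or stricter than what the native extensions actually require.

```python
import re
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '12.8' or '12.8.93'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"no version number found in {version!r}")
    return int(match.group(1)), int(match.group(2))


def nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Read the 'release X.Y' field from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a 'release X.Y' field in the nvcc output")
    return int(match.group(1)), int(match.group(2))


def toolchain_matches(torch_cuda: Optional[str], nvcc_output: str) -> bool:
    """Return True when PyTorch's CUDA version and nvcc agree on major.minor.

    Assumption: an exact major.minor match counts as "matching" here.
    """
    if torch_cuda is None:  # CPU-only PyTorch builds report no CUDA version
        return False
    return major_minor(torch_cuda) == nvcc_release(nvcc_output)
```

Pass `torch.version.cuda` as the first argument and the captured standard output of `nvcc --version` as the second. A `False` result suggests installing a CUDA toolkit or a PyTorch build so that the two agree before retrying the perf manifest.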
Disk or cache exhaustion
Symptoms:
A first run fails while downloading or loading checkpoints.
The process reports no space left on device, stops mid-load, or leaves a partial Hugging Face cache.
Output video or stats files are missing after an interrupted run.
Likely cause:
Model checkpoints and example assets are cached on first use. LingBot-World
downloads a checkpoint of about 70 GB under $HF_HOME and its docs recommend
keeping about 200 GB free for the model plus Hugging Face cache. Example data
and generated outputs also consume local disk under paths such as
assets/example_data/lingbot_world/<NN>/ and outputs/.
Fix or next step:
Check free space on both the repository filesystem and the Hugging Face cache
filesystem. If the default cache location is too small, point HF_HOME at a
larger volume before running the model:
export HF_HOME=/path/to/large/cache
Then rerun the same command so the downloader can reuse or repair the cache.
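The free-space check can be scripted. The sketch below reports free space for the current directory (assumed to be the repository checkout) and for the Hugging Face cache, and compares each against the 200 GB figure quoted above. The helper names are hypothetical, and the fallback of `~/.cache/huggingface` is the usual Hugging Face default when `HF_HOME` is unset; it ignores other overrides such as `XDG_CACHE_HOME`.

```python
import os
import shutil
from pathlib import Path

RECOMMENDED_FREE_GB = 200  # figure quoted above for LingBot-World plus cache


def hf_cache_root() -> Path:
    """Return $HF_HOME, or ~/.cache/huggingface when the variable is unset."""
    return Path(os.environ.get("HF_HOME", "~/.cache/huggingface")).expanduser()


def free_gb(path: Path) -> float:
    """Free space in GB (10**9 bytes) on the filesystem that holds `path`."""
    probe = path.resolve()
    # The cache directory may not exist yet, so walk up to an existing parent.
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free / 1e9


for label, location in (("repository", Path.cwd()), ("HF cache", hf_cache_root())):
    available = free_gb(location)
    status = "ok" if available >= RECOMMENDED_FREE_GB else "below the 200 GB guideline"
    print(f"{label}: {available:.0f} GB free at {location} ({status})")
```

The 200 GB threshold applies to LingBot-World; smaller models need less, so treat a "below" result as a prompt to check that model's page rather than as a hard failure.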
Model download or authentication failure
Symptoms:
A download returns 401, 403, not found, or gated-repository errors.
OmniDreams scene or checkpoint downloads fail before the demo starts.
LingBot-World fails while fetching the model checkpoint from Hugging Face.
Likely cause:
Most model runs need Hugging Face authentication. OmniDreams requires an
HF_TOKEN with read access to the nvidia/omni-dreams-scenes dataset and
the nvidia/omni-dreams-models model repository. Other model pages also
document first-run downloads from Hugging Face.
Fix or next step:
Export a valid token in the same shell that launches FlashDreams:
export HF_TOKEN=<your-hf-token>
If the token is already set, confirm that the account behind it can open the model or dataset page referenced by the failing command, then rerun the FlashDreams command.
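Access can also be checked from Python before launching. The following is a sketch that assumes the `huggingface_hub` package is installed and the machine has network access. `token_from_env` and `check_access` are hypothetical helper names; `HfApi.repo_info` is the Hub client call used to fetch repository metadata.

```python
import os

# Repositories named above; the repo_type values follow Hugging Face Hub usage.
REQUIRED_REPOS = (
    ("nvidia/omni-dreams-scenes", "dataset"),
    ("nvidia/omni-dreams-models", "model"),
)


def token_from_env(env=os.environ):
    """Return HF_TOKEN with whitespace stripped, or None if unset or empty."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def check_access(repo_id, repo_type, token):
    """Return 'ok' if the token can read repo metadata, else the error class name."""
    from huggingface_hub import HfApi  # imported lazily; needs network access

    try:
        HfApi().repo_info(repo_id, repo_type=repo_type, token=token)
    except Exception as exc:  # e.g. gated-repo, not-found, or HTTP 401/403 errors
        return type(exc).__name__
    return "ok"


# Example use (performs network requests):
# token = token_from_env()
# for repo_id, repo_type in REQUIRED_REPOS:
#     print(repo_id, check_access(repo_id, repo_type, token))
```

A result other than `ok` names the exception class raised by the Hub client, which usually indicates whether the token is missing, invalid, or lacks access to a gated repository.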
GPU out of memory
Symptoms:
The run exits with CUDA out of memory, or the Python process is killed during model load or generation.
LingBot-World runs out of memory on a single GPU when using large --total-blocks values.
Multi-GPU commands fail after one or more ranks report memory pressure.
Likely cause:
The selected model, resolution, rollout length, or GPU count does not fit the available VRAM. The model pages list minimum VRAM expectations: about 48 GB for OmniDreams, about 24 GB for Self-Forcing, and about 120 GB for LingBot-World.
Fix or next step:
Use a smaller documented run first: reduce --total-blocks, lower
--pixel-height and --pixel-width where the model page exposes those
flags, or use the documented multi-GPU torchrun --nproc_per_node=<N>
launch. For LingBot-World, also try the documented efficient streaming preset
lingbot-world-fast-taehv-window15-sink3.
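Before choosing a smaller run, it can help to compare each GPU's memory against the minimums quoted above. The sketch below does that with `torch.cuda.mem_get_info`. The dictionary keys are illustrative labels rather than runner slugs, and the per-device comparison is a rough guide only: it does not model how a multi-GPU launch splits memory across ranks.

```python
# Approximate minimum VRAM per model, in GB, as quoted above.
# The keys are illustrative labels, not runner slugs.
MIN_VRAM_GB = {"omnidreams": 48, "self-forcing": 24, "lingbot-world": 120}


def fits(total_vram_bytes: int, model: str) -> bool:
    """Return True if a device with this much memory meets the quoted minimum."""
    return total_vram_bytes / 1e9 >= MIN_VRAM_GB[model]


def report(model: str) -> None:
    """Print per-GPU memory and whether each device meets the quoted minimum."""
    import torch  # imported lazily so fits() works without PyTorch installed

    for index in range(torch.cuda.device_count()):
        free_bytes, total_bytes = torch.cuda.mem_get_info(index)
        verdict = "meets" if fits(total_bytes, model) else "is below"
        print(
            f"GPU {index}: {free_bytes / 1e9:.1f} GB free of "
            f"{total_bytes / 1e9:.1f} GB, {verdict} the "
            f"~{MIN_VRAM_GB[model]} GB guideline for {model}"
        )
```

Calling `report("lingbot-world")` on a CUDA machine prints one line per visible GPU. If free memory is much lower than total, another process may be holding VRAM.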
WebRTC connection or video does not appear
Symptoms:
/request_session is not reachable from the browser.
The page loads but video never appears.
The server seems idle before printing the connection URL.
Likely cause:
The WebRTC servers open their HTTP port only after model load and warmup. On a
remote or cloud GPU instance, the server port may not be reachable directly at
the host IP. If /request_session loads but video does not appear, the
browser may be replacing local IP addresses in its WebRTC ICE candidates with mDNS hostnames.
Fix or next step:
Wait until the server prints Connect via http://<server-ip>:8089/request_session.
For remote machines, forward the documented port and open the local URL:
ssh -L 8089:localhost:8089 <user>@<host>
Then open http://localhost:8089/request_session. If the page loads but the
video still does not appear, apply the browser-specific WebRTC setting
documented on the NVIDIA OmniDreams or LingBot-World page.
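To tell an unreachable port apart from a browser-side WebRTC problem, a plain TCP probe is often enough. The sketch below uses only the standard library; `port_open` is a hypothetical helper name, and 8089 is the port from the connection URL above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False


# With the SSH tunnel from above active, probe the forwarded local port.
print("8089 reachable:", port_open("localhost", 8089))
```

A `False` result before the server prints its connection URL is expected, because the HTTP port opens only after model load and warmup. A `True` result with no video points toward the browser WebRTC settings rather than the network path.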
Triton autotuning or warmup looks stuck
Symptoms:
The first launch takes several minutes.
Logs mention Triton autotuning or CUDA-graph warmup.
Later runs are much faster than the first run.
Likely cause:
Cold runs include one-time setup. The quickstart and model pages document that first launches can include downloads, Triton autotuning, CUDA-graph warmup, and, for OmniDreams native acceleration, first-use extension compilation.
Fix or next step:
Let the first launch finish if it is still making progress. Subsequent launches
reuse caches. For quick validation, use the small documented demo values such
as --total-blocks 7 for Self-Forcing or inspect a runner without loading the
model by using --no-instantiate as described below.
--no-instantiate prints a config but does not run
Symptoms:
flashdreams-run prints Resolved config for ... and exits without downloading checkpoints, warming up, or writing an output video.
No files appear under outputs/.
Likely cause:
--no-instantiate is a diagnostic flag. It resolves and prints the runner
configuration, then returns before creating the runner or calling
runner.run().
Fix or next step:
Use --no-instantiate only when you want to confirm that a runner slug and
CLI overrides parse correctly:
uv run flashdreams-run --no-instantiate self-forcing-wan2.1-t2v-1.3b-taehv
Remove the flag for a real generation run, or use --help on the runner
slug to inspect all supported options.