NVIDIA OmniDreams#
OmniDreams is an HDMap-conditioned world model for single-view and multi-view driving generation, with presets that balance visual fidelity and runtime throughput.
Teaser video source: OmniDreams project page.
Requirements#
Minimum VRAM: ~48 GB.
PyTorch: >= 2.11.
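A quick preflight check can catch an unsuitable environment before a long install. The sketch below is illustrative and not part of OmniDreams: `parse_major_minor` and `meets_requirements` are hypothetical helpers that only encode the two requirements listed above. It assumes "GB" can be read as GiB and allows 1 GiB of slack, because drivers usually report slightly less than the nominal capacity.

```python
import re
from typing import List, Tuple

MIN_VRAM_GIB = 48.0     # "~48 GB" from the requirements (approximate)
VRAM_SLACK_GIB = 1.0    # assumption: tolerate driver-reported shortfall
MIN_TORCH = (2, 11)     # "PyTorch >= 2.11" from the requirements


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '2.11.0+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_requirements(torch_version: str, vram_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    if parse_major_minor(torch_version) < MIN_TORCH:
        problems.append(f"PyTorch {torch_version} is older than 2.11")
    vram_gib = vram_bytes / 2**30
    if vram_gib < MIN_VRAM_GIB - VRAM_SLACK_GIB:
        problems.append(f"{vram_gib:.1f} GiB VRAM is below ~48 GB")
    return problems


# Usage on a machine where PyTorch is already installed:
#   import torch
#   props = torch.cuda.get_device_properties(0)
#   print(meets_requirements(torch.__version__, props.total_memory))
```

An empty result only means the two listed minimums are met; it says nothing about driver or CUDA compatibility.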
Installation#
# from the repo root
uv sync --project integrations/omnidreams
Running the method#
To run OmniDreams, launch one of the registered runner slugs. For example:
uv run --project integrations/omnidreams \
flashdreams-run \
omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf \
--example-data True \
--example_data_uuid "239560dc-33d1-11ef-9720-00044bcbccac" \
--total-blocks 20
Sample example-data UUIDs for the inference script are available in the nvidia/omni-dreams-samples Hugging Face dataset.
We provide the following variants:
| Method | Description |
|---|---|
| | Single-view 2-step HDMap-conditioned I2V. |
For multi-GPU inference, use:
uv run --project integrations/omnidreams \
torchrun --nproc_per_node=4 --no-python flashdreams-run \
omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf \
--example-data True \
--example_data_uuid "239560dc-33d1-11ef-9720-00044bcbccac" \
--total-blocks 20
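When sweeping several sample UUIDs, it can be convenient to assemble the single-GPU and multi-GPU invocations programmatically. The sketch below is not part of the repository: `build_command` is a hypothetical helper that only rearranges the flags shown in the commands above, and it prints the command instead of running it.

```python
import shlex
from typing import List

SLUG = "omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf"


def build_command(uuid: str, total_blocks: int = 20, nproc: int = 1,
                  slug: str = SLUG) -> List[str]:
    """Build the argv list for one run; nproc > 1 wraps it in torchrun."""
    if nproc < 1:
        raise ValueError("nproc must be at least 1")
    cmd = ["uv", "run", "--project", "integrations/omnidreams"]
    if nproc > 1:
        cmd += ["torchrun", f"--nproc_per_node={nproc}", "--no-python"]
    cmd += [
        "flashdreams-run", slug,
        "--example-data", "True",
        "--example_data_uuid", uuid,
        "--total-blocks", str(total_blocks),
    ]
    return cmd


if __name__ == "__main__":
    # Extend this list with UUIDs from the samples dataset.
    for uuid in ["239560dc-33d1-11ef-9720-00044bcbccac"]:
        print(shlex.join(build_command(uuid, nproc=4)))
        # To execute instead of print:
        #   import subprocess
        #   subprocess.run(build_command(uuid, nproc=4), check=True)
```

Returning a list rather than a string avoids shell quoting problems when the command is passed to `subprocess.run`.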
To inspect all supported CLI arguments and their default values, run:
uv run --project integrations/omnidreams \
flashdreams-run \
omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf \
--help
Some generated samples from the above commands:
Launch the interactive demo#
interactive-drive runs the OmniDreams single-view pipeline in a
single process and streams the camera view to your browser. The demo
machine only needs a CUDA-capable GPU – no graphics-capable GPU,
display server, or Vulkan support is required.
The demo requires access to NVIDIA/flashdreams
and an HF_TOKEN with read access to
nvidia/omni-dreams-scenes
(scene USDZs) and
nvidia/omni-dreams-models
(checkpoints).
First-time setup:
git clone https://github.com/NVIDIA/flashdreams.git
cd flashdreams
export HF_TOKEN=<your-hf-token>
uv sync --package flashdreams-omnidreams --extra interactive-drive
Optionally, pre-download scenes and checkpoints so the first launch isn’t blocked on network I/O:
uv run --package flashdreams-omnidreams omnidreams-prepare
Run the demo and stream to your browser:
uv run --package flashdreams-omnidreams interactive-drive --stream-mjpeg :8080
Then open http://<server-ip>:8080/ in any browser on the same
network and pick a scene from the picker in the bottom-right.
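For scripted smoke tests, it may be useful to confirm that the server is delivering frames without opening a browser. MJPEG is commonly served as a continuous HTTP response containing back-to-back JPEG images, so a minimal sketch can scan the byte stream for JPEG start and end markers. This is a generic technique, not the demo's own client: the stream's URL path and exact framing are not documented here, so `STREAM_URL` in the usage comment is a placeholder, and the marker scan can mis-split images that embed thumbnails.

```python
from typing import Iterable, Iterator

JPEG_SOI = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def iter_jpeg_frames(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield complete JPEG images found in a stream of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            start = buffer.find(JPEG_SOI)
            if start < 0:
                # Keep the last byte in case a marker is split across chunks.
                buffer = buffer[-1:]
                break
            end = buffer.find(JPEG_EOI, start + 2)
            if end < 0:
                # Frame is incomplete: drop any preamble and wait for more.
                buffer = buffer[start:]
                break
            yield buffer[start:end + 2]
            buffer = buffer[end + 2:]


# Usage sketch (STREAM_URL is a placeholder, not a documented endpoint):
#   import urllib.request
#   with urllib.request.urlopen(STREAM_URL) as resp:
#       chunks = iter(lambda: resp.read(4096), b"")
#       first_frame = next(iter_jpeg_frames(chunks))
#       print(len(first_frame), "bytes in first frame")
```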
Note
The first launch is slow. The first time you start the demo, the world
model spends several minutes in a one-time optimization pass – checkpoint
loading, torch.compile / CUDA-graph capture, and Triton autotuning –
before the view becomes interactive. The on-screen indicator shows
Loading world model... during warmup and then Optimizing world
model... while the first generated chunk is autotuned; this phase is
longest on the perf manifest. Subsequent launches are much faster because
the compiled kernels and CUDA graphs are cached and reused.
Note
Add --offload-text-encoder to reduce peak VRAM usage by ~15 GB:
uv run --package flashdreams-omnidreams interactive-drive \
--stream-mjpeg :8080 \
--offload-text-encoder
The text and first-frame encoders are run once per scene and freed before the diffusion pipeline is built, and the resulting embeddings are cached and reused across world-model resets.
Trade-off: the world model is rebuilt on each scene load instead of staying resident, so the first load and scene/variant switches are slower. Prefer it when VRAM-constrained; otherwise leave it off for faster switching.
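The encode-once, free, then reuse behaviour described above follows a common caching pattern. The sketch below illustrates that pattern only; `SceneEmbeddingCache` and its loader callback are hypothetical names, not the OmniDreams implementation, and a real GPU version would also need to release device memory (for example with `torch.cuda.empty_cache()`) after dropping the encoder.

```python
from typing import Callable, Dict, Tuple

Embedding = Tuple[float, ...]
Encoder = Callable[[str], Embedding]


class SceneEmbeddingCache:
    """Run an expensive encoder once per scene, then serve cached results."""

    def __init__(self, load_encoder: Callable[[], Encoder]) -> None:
        self._load_encoder = load_encoder
        self._cache: Dict[str, Embedding] = {}
        self.encoder_loads = 0  # counts how often the encoder was built

    def get(self, scene_id: str, prompt: str) -> Embedding:
        if scene_id not in self._cache:
            encoder = self._load_encoder()
            self.encoder_loads += 1
            try:
                self._cache[scene_id] = encoder(prompt)
            finally:
                # Drop the only reference so the encoder can be reclaimed
                # before the (much larger) diffusion pipeline is built.
                del encoder
        return self._cache[scene_id]
```

With this structure, resetting the world model for an already-visited scene reuses the stored embedding and never reloads the encoder.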
On a consumer NVIDIA GPU that exposes a graphics stack,
omit the --stream-mjpeg flag to open the demo in a local Vulkan window
instead:
uv run --package flashdreams-omnidreams interactive-drive
The local window’s HUD adds a weather-variant selector (clear, rain, snow) next to the scene picker, so the same scene can be switched between conditions.
Note
The local window requires a display server and the system OpenGL / Vulkan client libraries. On Debian/Ubuntu:
sudo apt install -y libx11-6 libxcb1 libgl1 libglx-mesa0 libvulkan1
A Failed to initialize GLFW error indicates that the display or one of these
libraries is missing.
Steering wheel and game controller#
A steering wheel or game controller can be used to control the local window mode. Any device that Ubuntu detects as a standard game controller or joystick is viable. We provide a configuration tool to calibrate these:
uv run --package flashdreams-omnidreams interactive-drive-configuration
The demo auto-loads your default profile on subsequent launches. When you
have more than one profile, the configuration tool’s start screen lists them
with Make default (plus Edit and Delete) buttons – re-run the tool to
choose which profile interactive-drive loads by default, tweak a profile
(steering sensitivity, deadzone, buttons, force feedback), or remove one.
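Sensitivity and deadzone settings are usually applied as a remapping of the raw axis value. The function below shows one common formulation, assuming an axis normalized to [-1, 1]. It is a generic illustration with hypothetical parameter names and defaults; the configuration tool may use a different curve.

```python
import math


def shape_steering(raw: float, deadzone: float = 0.05,
                   sensitivity: float = 1.0) -> float:
    """Map a raw axis value in [-1, 1] to a steering command in [-1, 1].

    Inputs inside the deadzone return 0. The remaining range is rescaled
    so the output still starts at 0 just outside the deadzone, then
    multiplied by the sensitivity and clamped.
    """
    if not 0.0 <= deadzone < 1.0:
        raise ValueError("deadzone must be in [0, 1)")
    magnitude = min(abs(raw), 1.0)
    if magnitude <= deadzone:
        return 0.0
    rescaled = (magnitude - deadzone) / (1.0 - deadzone)
    return math.copysign(min(1.0, rescaled * sensitivity), raw)
```

Rescaling after the deadzone avoids a sudden jump in steering when the wheel first leaves the dead region.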
Multiple devices. A profile can bind controls across several devices – for example a wheel base plus a separately-connected or different-brand pedal set. Ctrl+click to select more than one device on the configuration tool’s device page; each control binds to whichever selected device it moves on.
Force feedback. The method is auto-detected per wheel: a driver-managed
autocenter spring (Thrustmaster, Logitech) or a self-rendered constant force
(Fanatec, which has no autocenter). FFB needs the vendor’s Linux driver and
write access to /dev/input/* (add your user to the input group):
| Vendor | Driver |
|---|---|
| Thrustmaster | Out-of-tree hid-tmff2 plus a wheel-mode init ( |
| Fanatec | hid-fanatecff with the base in PC mode (CSL DD, ClubSport, Podium, DD Pro). |
| Logitech | In-kernel |
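Missing write access to the input devices is an easy thing to overlook. The following sketch lists event devices that the current user cannot write to and checks membership in the `input` group. The helper names are hypothetical, the `/dev/input/event*` pattern is an assumption about where the relevant nodes live, and the check does not verify that the vendor driver is loaded.

```python
import glob
import os
from typing import Callable, List


def unwritable_devices(
    pattern: str = "/dev/input/event*",
    can_write: Callable[[str], bool] = lambda p: os.access(p, os.W_OK),
) -> List[str]:
    """Return matching device nodes that the current user cannot write to."""
    return [path for path in sorted(glob.glob(pattern)) if not can_write(path)]


def in_group(name: str = "input") -> bool:
    """Report whether the current process belongs to the named group."""
    import grp  # Unix-only module, imported lazily

    try:
        gid = grp.getgrnam(name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    return gid in os.getgroups() or gid == os.getegid()


if __name__ == "__main__":
    blocked = unwritable_devices()
    print("in 'input' group:", in_group())
    print("unwritable devices:", blocked or "none")
```

Group changes only apply to new login sessions, so a freshly added user may still see unwritable devices until they log out and back in.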
Native acceleration (perf manifest)#
The bundled example_world_model_perf.yaml manifest runs the DiT and
LightVAE through the OmniDreams single-view CUDA extension
(native_dit_acceleration: required), which is faster than the default
PyTorch path. The extension builds against pinned checkouts of CUTLASS,
SageAttention, SpargeAttn, and cudnn-frontend that are not vendored in the
repo. omnidreams-prepare --perf clones them at their pinned commits into
integrations/omnidreams/omnidreams_singleview/3rdparty/:
uv run --package flashdreams-omnidreams omnidreams-prepare --perf
This step only syncs sources; the extension itself compiles on the first
launch that uses the manifest (one-time, a few minutes). It requires a
Blackwell-class GPU (SM 12.0) or newer, a source checkout (the
omnidreams_singleview sources ship only in the git tree, not the wheel),
git, and a CUDA toolchain (nvcc) matching your PyTorch build. Then
point the demo at the perf manifest:
uv run --package flashdreams-omnidreams interactive-drive \
--manifest example_world_model_perf.yaml
native_dit_acceleration: required makes the manifest fail loudly if the
extension can’t build or load, rather than silently falling back to PyTorch.
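Because the extension compiles on first launch, a failed prerequisite only surfaces after a wait. A small preflight sketch can check the GPU architecture and the build tools listed above beforehand. `perf_prereq_problems` is a hypothetical helper, not part of OmniDreams, and it does not verify that nvcc matches the PyTorch build or that you are running from a source checkout.

```python
import shutil
from typing import Callable, List, Optional, Tuple

MIN_COMPUTE_CAPABILITY = (12, 0)   # "SM 12.0 or newer", per the text above
REQUIRED_TOOLS = ("git", "nvcc")   # tools named in the text above


def perf_prereq_problems(
    compute_capability: Tuple[int, int],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of unmet prerequisites; empty means none were found."""
    problems = []
    if tuple(compute_capability) < MIN_COMPUTE_CAPABILITY:
        major, minor = compute_capability
        problems.append(f"GPU is SM {major}.{minor}, need SM 12.0 or newer")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


# Usage on a machine with PyTorch installed:
#   import torch
#   print(perf_prereq_problems(torch.cuda.get_device_capability(0)))
```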
Alternative: WebRTC server#
For deployments that require a richer browser frontend with WebRTC’s
lower video-delivery latency and a streaming gRPC service for
multi-client setups, the standalone server at
omnidreams.webrtc.server ships a polished HTML5 client on top of
the same OmniDreams pipeline. The MJPEG path above is the
recommended starting point for most users; consider WebRTC if you
need bidirectional camera-control APIs or are already integrating
the gRPC service into a larger product.
# from the repo root
uv run --package flashdreams-omnidreams torchrun --nproc_per_node 1 \
-m omnidreams.webrtc.server \
--host 0.0.0.0 --port 8089 \
--pipeline_config_name omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf \
--scene-uuid "0d404ff7-2b66-498c-b047-1ed8cded60d4"
Sample scene UUIDs for the interactive server are available in the
nvidia/omni-dreams-scenes Hugging Face dataset.
Each scene ships clear, rain, and snow weather variants as sibling
archives; add --scene-variant rain (or snow) to serve a specific
one (the default is the clear-weather scene).
The server may take a few minutes to warm up. Once ready, it prints
Connect via http://<server-ip>:8089/request_session.
Here, <server-ip> is the IP address of the server you are connecting to
(use localhost when running locally).
Note
On a remote or cloud GPU instance (e.g. Brev),
the server port is usually not reachable at the host IP directly.
Forward it to your local machine first, then open
http://localhost:8089/request_session:
# Brev
brev port-forward <instance> -p 8089:8089
# or plain SSH
ssh -L 8089:localhost:8089 <user>@<host>
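Before opening the browser, it can help to confirm that something is listening on the forwarded port. The sketch below uses a plain TCP connect; `port_is_open` is a hypothetical helper. A successful connect only shows that the local end of the forward accepts connections, not that the OmniDreams server behind it has finished warming up.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("port 8089 reachable:", port_is_open("localhost", 8089))
```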
Once successfully connected, the browser-based UI looks like this:
Note
If /request_session loads but the video never appears, the
browser is likely obfuscating local IPs in WebRTC ICE candidates
(replacing them with mDNS .local hostnames), which prevents the
peer connection from completing. Disable the setting and reload:
Chrome / Edge: chrome://flags/#enable-webrtc-hide-local-ips-with-mdns → Disabled, then restart the browser.
Brave: brave://settings/privacy/security → WebRTC IP handling policy → Default public and private interfaces.
Firefox: about:config → media.peerconnection.ice.obfuscate_host_addresses → false.
Performance table#
Single-view latency on NVIDIA GB300 at 704 x 1280 resolution.
| Stage | 1x GPU | 2x GPU | 4x GPU | 8x GPU |
|---|---|---|---|---|
| HDMap Encoder | 28 ms | 26 ms | 26 ms | 26 ms |
| Diffusion DiT | 84 ms | 71 ms | 49 ms | 47 ms |
| VAE Decoder | 6 ms | 5 ms | 5 ms | 5 ms |
| KV-cache Update | 42 ms | 34 ms | 23 ms | 22 ms |
| Total | 118 ms | 102 ms | 80 ms | 78 ms |
| Effective FPS | 68 | 78 | 100 | 103 |
KV-cache Update is off the hot path and excluded from Total.
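The relationship between the rows can be checked with a few lines of arithmetic. In the sketch below, Total is the sum of the three hot-path stages. The frames-per-chunk value of 8 is an assumption inferred from the table (it is the value that reproduces the reported FPS after rounding); the source does not state it.

```python
GPU_COUNTS = [1, 2, 4, 8]

# Hot-path stage latencies in milliseconds, copied from the table.
STAGES_MS = {
    "HDMap Encoder": [28, 26, 26, 26],
    "Diffusion DiT": [84, 71, 49, 47],
    "VAE Decoder": [6, 5, 5, 5],
}
REPORTED_TOTAL_MS = [118, 102, 80, 78]
REPORTED_FPS = [68, 78, 100, 103]

FRAMES_PER_CHUNK = 8  # assumption inferred from the table, not from the source

# Total latency per chunk: KV-cache Update is excluded, as noted above.
totals = [sum(column) for column in zip(*STAGES_MS.values())]

# Effective FPS = frames produced per chunk / seconds per chunk.
fps = [round(FRAMES_PER_CHUNK * 1000 / total) for total in totals]

for gpus, total, rate in zip(GPU_COUNTS, totals, fps):
    print(f"{gpus}x GPU: total {total} ms, about {rate} FPS")
```

The same numbers show where scaling comes from: only the DiT stage shrinks meaningfully with more GPUs (84 ms to 47 ms), while the encoder and decoder stay nearly flat.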
Further reading#
Interactive-drive latency tuning covers the supported
interactive-drive latency knobs: model and backend choice, resolution, chunk-size constraints, FP8 and native acceleration, transport, and the validated GB300 reference.
Citation#
If you use OmniDreams, please cite the original work:
@misc{nvidia2026omnidreams,
title={OmniDreams: Real-Time Generative Closed-Loop Autonomous Vehicle Simulation Built on NVIDIA Cosmos},
author={Basant, Aarti and Kar, Amlan and Paschalidou, Despoina and Garcia Cobo, Guillermo and Turki, Haithem and Ling, Huan and Seo, Jaewoo and Wang, Jialiang and Lucas, James and Wu, Jay and Lorraine, Jonathan and Gao, Jun and He, Kai and Tothova, Katarina and Xie, Kevin and Tyszkiewicz, Michal and Wu, Qi and de Lutio, Riccardo and Li, Ruilong and Fidler, Sanja and Kim, Seung Wook and Shen, Tianchang and Cao, Tianshi and Pfaff, Tobias and Lew, William and Ren, Xuanchi and Lu, Yifan and Gojcic, Zan and Wang, Zian},
year={2026},
note={Technical report},
howpublished={\url{https://research.nvidia.com/labs/sil/projects/omnidreams-blog/paper.pdf}}
}