XR AI

Build AI agents that see and hear what your users experience in XR, and respond in real time.

XR AI is an open-source stack that connects web, iOS/visionOS, AR-glasses, and XR-headset clients to GPU-accelerated AI services and tool-using agents. An agent can perceive live physical context, call tools through MCP, and reply with audio or data in the same session. For remote-rendered AR and XR, XR AI integrates NVIDIA CloudXR, as the xr-render-demo sample shows.

It is especially useful when you need to:

Build multimodal XR agents that see, hear, reason, use tools, and respond in real time.
Target multiple client platforms — web, iOS/visionOS, AR glasses, and XR headsets.
Use NVIDIA open models out of the box while keeping the freedom to bring your own.
Deploy wherever NVIDIA GPUs run — cloud, data center, workstation, or edge.
Integrate NVIDIA CloudXR for remote-rendered AR and XR, as the xr-render-demo sample shows.
Keep transport, rendering, model services, tools, and agent logic separated so each layer evolves independently.

🚀 Get started

Run a sample in minutes: the model servers, the simple VLM agent, or the full xr-render-demo sample.

Quickstart

🧩 Architecture

How XR-Media-Hub, the transport, and agents fit together.

Architecture

🛠️ Components

The server runtime, agent SDK, MCP servers, AI services, and the launcher.

Components

📦 Build a sample

Wire your own agent worker into the stack.

Adding a new sample