XR AI

Build AI agents that see and hear what your users experience in XR, and respond in real time.

XR AI is an open-source stack that connects web, iOS/visionOS, AR-glasses, and XR-headset clients to GPU-accelerated AI services and tool-using agents. An agent can perceive live physical context, call tools through MCP, and reply with audio or data in the same session. For remote-rendered AR and XR, XR AI integrates NVIDIA CloudXR, as the xr-render-demo sample shows.

It is especially useful when you need to:

  • Build multimodal XR agents that see, hear, reason, use tools, and respond in real time.

  • Target multiple client platforms — web, iOS/visionOS, AR glasses, and XR headsets.

  • Use NVIDIA open models out of the box while keeping the freedom to bring your own.

  • Deploy wherever NVIDIA GPUs run — cloud, data center, workstation, or edge.

  • Integrate NVIDIA CloudXR for remote-rendered AR and XR, as the xr-render-demo sample shows.

  • Keep transport, rendering, model services, tools, and agent logic separated so each layer evolves independently.
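The layer separation in the last point can be pictured with a small sketch. This is an illustration only and does not use the XR AI SDK: `Frame`, `Model`, `Tool`, `EchoModel`, `LookupTool`, and `Agent` are hypothetical names invented here, and the tool is a plain in-process object standing in for one that a real agent would reach through MCP.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Frame:
    """Hypothetical unit of perceived context: a scene caption plus a speech transcript."""
    scene: str
    speech: str


class Model(Protocol):
    """Hypothetical model-service interface; any object with describe() fits."""
    def describe(self, frame: Frame) -> str: ...


class Tool(Protocol):
    """Hypothetical tool interface; stands in for a tool exposed over MCP."""
    name: str
    def call(self, query: str) -> str: ...


class EchoModel:
    """Stand-in model that simply restates what it was given."""
    def describe(self, frame: Frame) -> str:
        return f"user sees {frame.scene} and says '{frame.speech}'"


class LookupTool:
    """Stand-in tool backed by an in-memory dictionary."""
    name = "lookup"

    def __init__(self, facts: dict[str, str]) -> None:
        self.facts = facts

    def call(self, query: str) -> str:
        return self.facts.get(query, "unknown")


class Agent:
    """Agent logic depends only on the two interfaces above, not on concrete classes."""
    def __init__(self, model: Model, tools: list[Tool]) -> None:
        self.model = model
        self.tools = {tool.name: tool for tool in tools}

    def respond(self, frame: Frame) -> str:
        summary = self.model.describe(frame)
        detail = self.tools["lookup"].call(frame.scene)
        return f"{summary}; lookup: {detail}"


agent = Agent(EchoModel(), [LookupTool({"a torque wrench": "tighten to 25 Nm"})])
reply = agent.respond(Frame(scene="a torque wrench", speech="what setting?"))
print(reply)
```

Because `Agent` only sees the `Model` and `Tool` interfaces, either stand-in can be replaced with a different implementation without touching the agent logic, which is the property the list item describes.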

🚀 Get started

Run a sample in minutes: the model servers, the simple VLM agent, or the full xr-render-demo sample.

Quickstart

🧩 Architecture

How XR-Media-Hub, the transport, and agents fit together.

Architecture

🛠️ Components

The server runtime, agent SDK, MCP servers, AI services, and the launcher.

Components

📦 Build a sample

Wire your own agent worker into the stack.

Adding a new sample