Televiz#

Televiz (isaacteleop.viz) is a lightweight compositor for Isaac Teleop. It composites camera and sensor feeds — with 3D rendered content coming soon — into an XR headset, a desktop window, or an offscreen buffer, integrating directly with the device-tracking and retargeting pipeline.

It is a compositor, not a capture or streaming layer: it consumes GPU frames and assembles them into a final image. Camera capture, decode, and network transport live in the application (see Camera Streaming).

The compositor is implemented in C++ (namespace viz, built on Vulkan + OpenXR + CUDA with no external rendering-framework dependency) and exposed through a pybind11 binding. This page uses the Python API, which mirrors the C++ names one-to-one — see C++ API to link against the library directly.

Overview#

The central object is VizSession, which owns the Vulkan context, the display target, the OpenXR session (in XR mode), and a registry of layers. Content producers submit GPU buffers to layers; the session composites every layer into one frame each time you call render().

The built-in layer type today is QuadLayer — a CUDA-fed 2D texture plane (mono or stereo), optionally placed in 3D space. Use it for camera feeds.

Note

Coming soon: ProjectionLayer, a full-view stereo RGBD layer for external renderers (gsplat, nvblox, neural reconstruction) that produce per-view (color, depth) buffers, Z-composited with quads. It is not yet available in this release — see ProjectionLayer (coming soon) below.

All symbols are imported from the top-level module:

import isaacteleop.viz as televiz

Display modes#

A session runs in exactly one display mode, set on the config:

DisplayMode

Behavior

kXr

OpenXR + Vulkan. Per-eye swapchains, stereo rendering, depth composition layer. Requires a running OpenXR runtime (e.g. CloudXR).

kWindow

GLFW desktop window. Layers are aspect-fit tiled; stereo layers show the left eye.

kOffscreen

No display. Composite to an internal target and pull pixels back with readback_to_host(). Useful for tests and headless rendering.

Quick start#

A minimal offscreen render-and-readback (no GPU display, no headset):

import cupy as cp
import isaacteleop.viz as televiz

cfg = televiz.VizSessionConfig()
cfg.mode = televiz.DisplayMode.kOffscreen
cfg.window_width = 1024
cfg.window_height = 1024

session = televiz.VizSession.create(cfg)

layer_cfg = televiz.QuadLayerConfig()
layer_cfg.name = "cam"
layer_cfg.resolution = televiz.Resolution(1024, 1024)
layer = session.add_quad_layer(layer_cfg)

# Any __cuda_array_interface__ array (CuPy / PyTorch / Numba) or a VizBuffer.
frame = cp.zeros((1024, 1024, 4), dtype=cp.uint8)   # RGBA8
layer.submit(frame)

info = session.render()                # wait + composite + present
img = session.readback_to_host()       # HostImage; numpy.asarray(img) for pixels

session.destroy()

For a window or headset, set mode to DisplayMode.kWindow or DisplayMode.kXr instead. The layer setup is identical; you just drive a frame loop (see Frame loop) rather than the one-shot readback_to_host(), which is offscreen-only.

Session configuration#

VizSessionConfig fields:

Field

Default

Description

mode

DisplayMode.kXr / kWindow / kOffscreen. Required.

window_width / window_height

Render size for window and offscreen modes. Ignored in XR (the runtime dictates per-eye resolution; query it with get_recommended_resolution()).

app_name

"televiz"

OpenXR application name.

required_extensions

[]

Extra OpenXR instance extensions to enable when Televiz hosts the session and downstream components (e.g. TeleopSession trackers) need them. Televiz already enables its own rendering extensions. See Sharing the XR session.

xr_near_z / xr_far_z

Near / far planes for the XR projection.

xr_system_wait_seconds

How long to wait for the OpenXR system (headset) to become available at create time.

clear_color

Background color as an (r, g, b, a) sequence in [0, 1].

gpu_timing

Enable GPU timestamp queries, surfaced via get_gpu_timing().

Construct the session with the factory; never call the class directly:

session = televiz.VizSession.create(cfg)

Layers#

Layers render in insertion order — the first added renders first (underneath). A layer is owned by the session; add_quad_layer returns a non-owning handle, so don’t keep it past the session’s lifetime.

QuadLayer#

A 2D plane fed by a CUDA buffer. Configure it with QuadLayerConfig:

Field

Default

Description

name

Layer name (used as the placement key in app config).

resolution

Source texture size, a Resolution. Submitted buffers must match it.

format

PixelFormat of the source (typically kRGBA8).

placement

Optional QuadLayerPlacement (pose + size_meters) for 3D placement in XR.

stereo

False

Per-eye stereo. When True, submit requires both eyes’ buffers; view 0 (left) samples the left buffer, view 1 (right) the right. Memory doubles.

stereo_baseline_mm

0

Horizontal disparity between the left/right planes (mm), along the placement’s local +x axis. 0 → both eyes see the same world quad. XR + stereo only.

generate_mipmaps

True

Allocate + regenerate a capped mip chain each frame; sampler uses trilinear filtering.

Submit and place a frame:

layer = session.add_quad_layer(layer_cfg)

# Mono: pass exactly one buffer. Stereo: layer.submit(left, right).
layer.submit(rgba_array)            # optional: stream=<cuda stream ptr>

# 3D placement (XR). Pose is OpenXR stage space: position (x,y,z),
# orientation quaternion (w,x,y,z). size_meters is (width, height).
placement = televiz.QuadLayerPlacement(
    televiz.Pose3D(position=(0.0, 0.0, -1.5), orientation=(1.0, 0.0, 0.0, 0.0)),
    size_meters=(1.0, 0.5625),
)
layer.set_placement(placement)
layer.set_visible(True)

submit(left, right=None, stream=0) accepts a VizBuffer or any __cuda_array_interface__ object; the binding converts it and releases the GIL across the copy. For a stereo layer both buffers are copied on the same stream and signaled together, so the renderer never sees a half-matched pair. Lock-mode placement strategies (world / head / lazy) are application policy and ship in the sample, not in the module.

ProjectionLayer (coming soon)#

Note

ProjectionLayer is under active development and not yet available in this release. The description below is a preview of the planned API and may change.

A planned full-view RGBD layer for in-loop renderers — gsplat, nvblox, or neural reconstruction engines that produce per-view (color, depth) buffers. Unlike QuadLayer, the renderer will run in-loop with the XR frame loop: render against the predicted view poses from the current frame, then submit between begin_frame() and end_frame(). Output is composited with depth, so it Z-combines with quad layers.

Frame loop#

Two API levels drive the frame loop. Both release the GIL during blocking waits.

Conveniencerender() does wait + composite + present in one call and returns a FrameInfo. Internally it checks should_render and skips the GPU pass when the runtime says the frame won’t be visible; producers’ submit writes still land in the back buffer.

while running:
    cam_layer.submit(camera_frame)
    info = session.render()

Explicitbegin_frame() / end_frame(), when the app needs the FrameInfo before submitting (e.g. to read the predicted view poses before rendering, or to skip expensive decode when not visible):

while running:
    info = session.begin_frame()
    if info.should_render:
        cam_layer.submit(decode_camera())   # skip decode when not visible
    session.end_frame()

FrameInfo carries frame_index, predicted_display_time (XR time in ns; 0 outside XR), delta_time (CPU wall-clock seconds — usable without any XR knowledge), should_render, resolution, and views. Each ViewInfo in views has viewport, fov, and pose — 2 entries in XR stereo, 1 (identity pose) in window / offscreen.

Session state#

A session moves through SessionState:

kUninitialized kReady kRunning kStopping kLost kDestroyed

  • kReady after create — add layers and submit content.

  • kRunning once the frame loop is active.

  • kStopping (XR) — the runtime is stopping; end_frame submits empty frames.

  • kLost (XR) — the session was lost; render / begin_frame raise. Destroy and recreate the VizSession (Televiz supports clean in-process recreation).

  • kDestroyed after destroy.

Query it with get_state(); in window mode should_close() reports the window-close request. OpenXR events are polled inside begin_frame, which drives the XR-specific transitions.

Sharing the XR session#

Only one OpenXR session is allowed per process. In XR mode VizSession creates a graphics-bound session (Isaac Teleop’s own OpenXRSession is headless and cannot render). When you use Televiz and TeleopSession together, let Televiz own the session and hand its live handles to TeleopSession so trackers attach to the same session — one CloudXR connection, synchronized timing.

Declare the extensions your trackers need in required_extensions (Televiz adds its own rendering extensions automatically), then pass the handles through:

import isaacteleop.viz as televiz
from isaacteleop.teleop_session_manager import TeleopSession, TeleopSessionConfig
from isaacteleop.deviceio import DeviceIOSession
from teleopcore.oxr import OpenXRSessionHandles

cfg = televiz.VizSessionConfig()
cfg.mode = televiz.DisplayMode.kXr
# Aggregate the XR extensions downstream trackers need so they're present
# on the XrInstance Televiz is about to create.
cfg.required_extensions = DeviceIOSession.get_required_extensions(trackers)
viz_session = televiz.VizSession.create(cfg)

config = TeleopSessionConfig(
    app_name="MyApp",
    pipeline=pipeline,
    oxr_handles=OpenXRSessionHandles(*viz_session.get_oxr_handles()),
)
with TeleopSession(config) as session:
    while running:
        session.step()
        cam_layer.submit(camera_frame)
        viz_session.render()

get_oxr_handles() returns (instance, session, space, proc_addr) as raw uint64 values (or None outside kXr); wrap them with OpenXRSessionHandles(*tuple) for TeleopSessionConfig.oxr_handles. VizSession and TeleopSession keep independent state machines and lifecycles — either can run without the other. In the unified pattern the underlying OpenXR session is shared, so if it is lost both must be recreated. See Teleop Session for the TeleopSession side.

API reference#

VizSession#

  • create(config) -> VizSession (static) — validate config + initialize Vulkan / display backend.

  • render() -> FrameInfo — wait + composite + present.

  • begin_frame() -> FrameInfo / end_frame() — explicit two-phase frame loop.

  • add_quad_layer(config) -> QuadLayer — construct + register a layer; returns a non-owning handle.

  • readback_to_host() -> HostImage — most recent frame as RGBA8 host pixels (kOffscreen only).

  • get_state() -> SessionState, should_close() -> bool, is_xr_mode() -> bool.

  • get_recommended_resolution() -> Resolution — runtime per-eye resolution (XR).

  • head_pose_now() -> Optional[Pose3D] — current head pose (kXr only; None on tracking loss).

  • get_oxr_handles() -> Optional[tuple](instance, session, space, proc_addr) as raw uint64.

  • get_frame_timing_stats() -> FrameTimingStats / get_gpu_timing() -> GpuFrameTiming.

  • destroy() — release all resources (idempotent).

  • Properties vk_device / vk_physical_device / vk_queue_family_index — raw handles for wiring Televiz into a foreign Vulkan app. Most users won’t touch these.

QuadLayer#

  • submit(left, right=None, stream=0) — submit a frame (mono: left only; stereo: both).

  • set_placement(placement) / placement() — 3D placement (None → fullscreen, window mode).

  • set_visible(visible) / is_visible().

  • Properties resolution, format, aspect_ratio, name.

Data types#

  • VizBuffer — non-owning 2D pixel buffer descriptor. Device buffers expose __cuda_array_interface__ (cupy.asarray(buf)); host buffers expose __array_interface__ (numpy.asarray(buf)).

  • HostImage — owning host pixel buffer returned by readback_to_host; wrap with numpy.asarray.

  • Resolution (width, height), Pose3D (position, orientation as (w, x, y, z)), Fov, ViewInfo.

  • Enums DisplayMode, PixelFormat (kRGBA8 / kD32F), MemorySpace, SessionState.

C++ API#

Televiz is a C++ library; isaacteleop.viz is a thin pybind11 binding over it. The Python and C++ APIs share the same type and method names (VizSession, QuadLayer, submit, set_placement, DisplayMode::kXr, …), so everything on this page maps directly to C++. All symbols live in namespace viz, and headers use nested include paths:

#include <viz/session/viz_session.hpp>
#include <viz/layers/quad_layer.hpp>
#include <viz/core/viz_buffer.hpp>

Enable BUILD_VIZ (default OFF; requires Vulkan, the CUDA toolkit, and glslangValidator) and link the relevant CMake target:

Target

Alias

Provides

viz_core

viz::core

Core types (VizBuffer, Pose3D, HostImage, DeviceImage) and Vulkan / CUDA infrastructure

viz_layers

viz::layers

LayerBase and the built-in layers (QuadLayer, …)

viz_session

viz::session

VizSession, the compositor, FrameInfo, window / offscreen backends

viz_xr

viz::xr

OpenXR backend — per-eye swapchains, depth composition layer

Public headers live under src/viz/<module>/cpp/inc/viz/. One difference from the Python bindings: in C++, layers are added with a single templated VizSession::add_layer<L>(args...) method, which also accepts your own LayerBase subclasses — the route for plugging in a custom renderer. See Build System for build options and output locations.

More information#

  • Camera Streaming — the reference camera_viz sample built on Televiz

  • Teleop Session — how TeleopSession works and how to share its OpenXR session

  • src/viz/ — module source, organized as core / layers / session / xr / shaders / python sub-modules