User Guide

OSMO is an open-source workflow orchestration platform purpose-built for Physical AI and robotics development.

Write your entire Physical AI development pipeline (training, simulation, hardware-in-the-loop testing) in declarative YAML. OSMO automatically coordinates tasks across heterogeneous compute, managing dependencies and resource allocation for you.

[Figure: OSMO overview]

🚀 From workstation to cloud in minutes

Develop on your laptop. Deploy to EKS, AKS, GKE, on-premise, or air-gapped clusters. Zero code changes.

Physical AI development uniquely requires orchestrating three types of compute:

  • 🧠 Training GPUs (GB200, H100) for deep learning and reinforcement learning

  • 🌐 Simulation Hardware (RTX PRO 6000) for realistic physics and sensor rendering

  • 🤖 Edge Devices (Jetson AGX Thor) for hardware-in-the-loop testing and validation

[Figure: robot simulation]

OSMO solves this Three Computer Problem by orchestrating your entire robotics pipeline with simple YAML workflows: no custom scripts, no infrastructure expertise required. Solving this fundamental challenge brings Physical AI one step closer to reality.

Why Choose OSMO

🚀 Zero-Code Orchestration

Write workflows in simple YAML with no coding overhead. Define what you want to run; OSMO handles the rest.

⚡ Group Scheduling

Run training, simulation, and edge testing simultaneously across heterogeneous hardware in a single workflow.

🌐 Truly Portable

Same workflow runs on your laptop, cloud, or on-premise—no infrastructure rewrites as you scale.

💾 Smart Storage

Content-addressable datasets with automatic deduplication save 10-100x on storage costs.

🔧 Interactive Development

Launch VSCode, Jupyter, or SSH into running tasks for live debugging and development.

🎯 Infrastructure-Agnostic

Write workflows without knowing (or caring) about underlying infrastructure. Focus on robotics, not DevOps.

How It Works

1. Define 📝

Write your workflow in YAML

2. Submit 🚀

Launch via CLI or web UI

3. Execute ⚙️

OSMO orchestrates the tasks in your workflow

4. Iterate 🔄

Access results and refine

Example Workflow:

# Your entire physical AI pipeline in a YAML file
workflow:
  tasks:
  - name: simulation
    image: nvcr.io/nvidia/isaac-sim
    platform: rtx-pro-6000          # Runs on NVIDIA RTX PRO 6000 GPUs

  - name: train-policy
    image: nvcr.io/nvidia/pytorch
    platform: gb200                 # Runs on NVIDIA GB200 GPUs
    resources:
      gpu: 8
    inputs:                         # Feed the output of the simulation task into training
    - task: simulation

  - name: evaluate-thor
    image: my-robot:latest
    platform: jetson-agx-thor       # Runs on NVIDIA Jetson AGX Thor
    inputs:
    - task: train-policy            # Feed the output of the training task into eval
    outputs:
    - dataset:
        name: thor-benchmark        # Save the output benchmark into a dataset
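The `inputs` fields above define a directed acyclic graph of tasks: a task can only start once everything it consumes has finished. Conceptually, resolving the execution order of the example is a topological sort, which can be sketched with Python's standard library (this is illustrative, not OSMO's actual scheduler):

```python
from graphlib import TopologicalSorter

# Each task maps to the tasks it consumes (the `inputs` fields above)
deps = {
    "simulation": [],
    "train-policy": ["simulation"],
    "evaluate-thor": ["train-policy"],
}

# static_order yields tasks only after all of their dependencies
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['simulation', 'train-policy', 'evaluate-thor']
```

In practice an orchestrator runs independent tasks concurrently rather than strictly in sequence, but the dependency constraint is the same: `evaluate-thor` cannot start before `train-policy` completes.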

Key Benefits

| What You Can Do | Example Tutorial |
| --- | --- |
| Interactively develop on remote GPU nodes with VSCode, SSH, or Jupyter notebooks | Interactive Workflows |
| Generate synthetic data at scale using Isaac Sim or custom simulation environments | Isaac Sim SDG |
| Train models with diverse datasets across distributed GPU clusters | Model Training |
| Train policies for robots using data-parallel reinforcement learning | Reinforcement Learning |
| Validate models in simulation with hardware-in-the-loop testing | Hardware In The Loop |
| Transform and post-process data for iterative improvement | Working with Data |
| Benchmark system software on actual robot hardware (NVIDIA Jetson, custom platforms) | Hardware Testing |

Bring Your Own Infrastructure

Flexible Compute

Connect any Kubernetes cluster to OSMO—cloud (AWS EKS, Azure AKS, Google GKE), on-premise clusters, or embedded devices like NVIDIA Jetson. OSMO lets you share resources efficiently, optimizing GPU utilization across heterogeneous hardware.
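Sharing a heterogeneous cluster is, at its core, a placement problem: each task requests some number of GPUs on a given platform, and the scheduler picks a node with enough free capacity. The sketch below shows a simple first-fit placement policy; node names and capacities are hypothetical, and OSMO's real scheduling is more sophisticated than this:

```python
# Free GPU capacity per node, grouped by platform (all values hypothetical)
nodes = {
    "gb200-node-a": {"platform": "gb200", "free_gpus": 8},
    "gb200-node-b": {"platform": "gb200", "free_gpus": 4},
    "rtx-node-a": {"platform": "rtx-pro-6000", "free_gpus": 2},
}

def place(platform, gpus):
    """First-fit: pick the first node of the right platform with capacity."""
    for name, node in nodes.items():
        if node["platform"] == platform and node["free_gpus"] >= gpus:
            node["free_gpus"] -= gpus  # reserve the GPUs on that node
            return name
    return None  # no capacity: the task would queue instead

print(place("gb200", 8))  # gb200-node-a
print(place("gb200", 8))  # None: the remaining gb200 node has only 4 free
```

The second request queues rather than failing outright; that is what lets many users share the same pool of GPUs without manually reserving machines.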

Flexible Storage

Connect any S3-compatible object storage or Azure Blob Storage. Store datasets and models with automatic version control, content-addressable deduplication, and seamless access across all compute backends.
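Content addressing is the general technique behind this deduplication: each object is keyed by a hash of its bytes, so two datasets (or two versions of the same dataset) that share content store it only once. A minimal illustrative sketch, not OSMO's actual storage layer:

```python
import hashlib

class ContentStore:
    """Toy content-addressable store with per-version dataset manifests."""

    def __init__(self):
        self.blobs = {}      # sha256 hex digest -> bytes
        self.manifests = {}  # (dataset, version) -> ordered list of digests

    def put_version(self, dataset, version, chunks):
        digests = []
        for chunk in chunks:
            key = hashlib.sha256(chunk).hexdigest()
            # Identical content hashes to the same key, so it is stored once
            self.blobs.setdefault(key, chunk)
            digests.append(key)
        self.manifests[(dataset, version)] = digests

store = ContentStore()
store.put_version("thor-benchmark", "v1", [b"frame-a", b"frame-b"])
store.put_version("thor-benchmark", "v2", [b"frame-a", b"frame-c"])

# Two versions reference four chunks, but only three unique blobs are stored
print(len(store.blobs))  # 3
```

Because the blob keys are derived purely from content, they are stable across backends: the same chunk resolves to the same address whether it lives in S3-compatible storage or Azure Blob Storage.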