Understanding Task Execution#
When you submit a workflow to OSMO, each task runs as a Kubernetes pod on your backend cluster. This page explains the technical architecture of these pods: how they're structured, how the containers inside the pod communicate, and what happens during execution.
Tip
Why read this? Understanding how OSMO executes workflow pods helps you debug issues, optimize data operations, and support users when workflows fail or when they use interactive features like exec and port-forward.
Task Pod Architecture#
Every workflow task executes as a Kubernetes pod with three containers that work together. These containers share volumes (/osmo/data/input and /osmo/data/output) and communicate over a Unix socket to orchestrate your task from data download through execution to results upload.
The Three Containers#
Each pod contains the following three containers, which together execute your task (a simplified pod-spec sketch follows the descriptions):
OSMO Init (Init Container)
Prepares the environment before your code runs:
Creates /osmo/data/input and /osmo/data/output directories
Installs the OSMO CLI (made available in your container)
Sets up Unix socket for inter-container communication
Runs once, then exits after setup
OSMO Ctrl (Sidecar Container)
Coordinates task execution and data transfer:
Downloads input data from cloud storage
Streams logs to OSMO service in real-time for monitoring
Uploads output artifacts after completion
Handles interactive requests (exec, port-forward)
Runs throughout task lifetime
Your Container (Main Container)
Runs your code:
Executes the command you specified
Uses requested CPU/GPU/memory resources
Reads input data from /osmo/data/input
Writes output data to /osmo/data/output
Logs to stdout/stderr
Runs your code from start to exit
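Put together, the generated pod looks roughly like the sketch below. This is an illustrative reconstruction of the description above, not the exact spec OSMO produces: the pod name, image placeholders, volume name, and the emptyDir choice are all assumptions.

apiVersion: v1
kind: Pod
metadata:
  name: train-model-abc123          # name generated by OSMO (illustrative)
spec:
  initContainers:
    - name: osmo-init               # creates directories, installs the CLI, sets up the socket
      image: <osmo-init-image>      # placeholder
      volumeMounts:
        - name: osmo-data
          mountPath: /osmo/data
  containers:
    - name: osmo-ctrl               # sidecar: downloads inputs, streams logs, uploads outputs
      image: <osmo-ctrl-image>      # placeholder
      volumeMounts:
        - name: osmo-data
          mountPath: /osmo/data
    - name: train-model             # your container: runs the command you specified
      image: nvcr.io/nvidia/pytorch:24.01-py3
      command: ["python", "train.py"]
      volumeMounts:
        - name: osmo-data
          mountPath: /osmo/data
  volumes:
    - name: osmo-data               # shared volume backing /osmo/data (assumed emptyDir)
      emptyDir: {}

The key point is the shared volume backing /osmo/data: it is what lets osmo-init prepare paths that osmo-ctrl and your container later read and write.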
Execution Flow#
Every task follows this four-phase progression:
1. Initialize
OSMO Init sets up environment
Creates directories, installs OSMO CLI, configures Unix socket for inter-container communication
2. Download
OSMO Ctrl fetches data
Downloads and extracts input datasets to /osmo/data/input
3. Execute
Your code runs inside the container
Reads inputs, writes outputs, logs streamed in real-time
4. Upload
OSMO Ctrl saves results
Uploads artifacts from /osmo/data/output
Note
Data handling is automatic. Your code only needs to read from {{input}} (/osmo/data/input) and write to {{output}} (/osmo/data/output). The osmo-ctrl container manages all transfers.
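If you have kubectl access to the backend cluster, these phases are also visible in the pod status. The output below is illustrative; the pod name and timings are made up, and exact status strings depend on your Kubernetes version:

$ kubectl get pod <pod-name> -w
NAME              READY   STATUS     RESTARTS   AGE
train-model-abc   0/2     Init:0/1   0          3s     ← phase 1: osmo-init running
train-model-abc   2/2     Running    0          15s    ← phases 2-4: osmo-ctrl and your container active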
Practical Guide#
Directory Structure#
Your container automatically has access to these paths:
/osmo/data/
├── input/      ← Read input datasets here
│   ├── 0/dataset1/
│   └── 1/dataset2/
├── output/     ← Write results here
│   └── (your artifacts)
└── socket/     ← Unix socket (managed by OSMO)
    └── data.sock
Example Task Configuration#
tasks:
  - name: train-model
    image: nvcr.io/nvidia/pytorch:24.01-py3
    command: ["python", "train.py"]
    args:
      - --input={{input:0}}/dataset1
      - --input={{input:1}}/dataset2
      - --output={{output}}/model
Debugging#
View Container Logs
osmo-ctrl logs (data operations):
$ kubectl logs <pod-name> -c osmo-ctrl
Your container logs (application output):
$ kubectl logs <pod-name> -c <task-name>
Interactive Access
Access your running container with a shell:
$ osmo exec my-workflow task-1 -- bash
How it works: the command flows through OSMO Service → osmo-ctrl → your container via the Unix socket
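For example, once inside the shell you can inspect the shared data directories directly. The session below is illustrative (the prompt and listing are made up):

$ osmo exec my-workflow task-1 -- bash
root@task-1:/# ls /osmo/data/input        ← one numbered directory per input
0  1
root@task-1:/# ls /osmo/data/output       ← anything written here is uploaded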
Resource Allocation
Your container: All requested CPU/GPU/memory
osmo-ctrl overhead: ~50-100 MB memory, minimal CPU (active during transfers only)
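If metrics-server is installed on the backend cluster, you can verify this split with standard Kubernetes tooling. The numbers below are illustrative:

$ kubectl top pod <pod-name> --containers
POD               NAME          CPU(cores)   MEMORY(bytes)
train-model-abc   osmo-ctrl     2m           74Mi
train-model-abc   train-model   3985m        28544Mi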
Learn More#
See also
Workflow Overview - User guide for writing workflows
Workflow Lifecycle - Understanding workflow states
Architecture - Overall OSMO system architecture