Version: develop

Configuration

sflow uses a YAML config file (default name: sflow.yaml). Top-level structure:

version: "0.1"
variables: ...
artifacts: ...
backends: ...
operators: ...
workflow: ...

This page follows the current schema in code (src/sflow/config/schema.py).

Tip: Looking for a quick lookup of all config fields? See the Quick Reference table.

version

Currently supported:

version: "0.1"

variables

Variables can be written as a dict or a list (they are normalized internally).

Dict form (recommended):

variables:
  SLURM_PARTITION:
    description: "Slurm partition"
    value: debug

List form:

variables:
  - name: SLURM_PARTITION
    description: "Slurm partition"
    value: debug
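
To make the "normalized internally" remark concrete, here is a minimal Python sketch of how the two forms could be reduced to one shape. The function name normalize_variables and the resulting dict layout are assumptions made for illustration; they are not taken from src/sflow/config/schema.py.

```python
def normalize_variables(raw):
    """Return variables as {name: {field: value}} for either input form.

    Hypothetical helper: sflow's real normalization may differ.
    """
    if isinstance(raw, dict):
        # Dict form: the keys are already the variable names.
        return {name: dict(spec) for name, spec in raw.items()}
    # List form: each entry carries its own "name" field.
    normalized = {}
    for entry in raw:
        spec = dict(entry)          # copy so the input is not mutated
        name = spec.pop("name")
        normalized[name] = spec
    return normalized


dict_form = {
    "SLURM_PARTITION": {"description": "Slurm partition", "value": "debug"},
}
list_form = [
    {"name": "SLURM_PARTITION", "description": "Slurm partition", "value": "debug"},
]

# Both spellings describe the same variable.
assert normalize_variables(dict_form) == normalize_variables(list_form)
```

Under this sketch, the dict form and the list form shown above are interchangeable.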

How to use:

  • In YAML expressions: ${{ variables.SLURM_PARTITION }}
  • In task scripts (as env var): ${SLURM_PARTITION}

Override variables via CLI (--set)

sflow run --file sflow.yaml --set SLURM_PARTITION=debug --set NUM_GPUS=4

Notes:

  • --set can only override variables that already exist in the config; setting an undefined variable is an error.
  • Values use simple type inference (int/float/bool/list/string).
  • List values set the variable domain for replica sweeps, and the variable value becomes the first item.
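
The notes above can be sketched in Python as follows. This is an illustration under stated assumptions, not sflow's parser: the helper names infer_value and apply_set are hypothetical, the bracketed list syntax ([1,2,4]) is an assumed spelling for list values, and the exact inference rules in sflow may differ.

```python
def infer_value(text):
    """Guess a typed value from a raw --set string (illustrative rules only)."""
    lowered = text.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    if text.startswith("[") and text.endswith("]"):
        # Assumed list syntax: comma-separated items inside brackets.
        inner = text[1:-1].strip()
        return [infer_value(item.strip()) for item in inner.split(",")] if inner else []
    return text


def apply_set(variables, assignment):
    """Apply one NAME=VALUE override to a {name: {"value": ...}} mapping."""
    name, _, raw = assignment.partition("=")
    if name not in variables:
        # Mirrors the documented rule: only existing variables can be overridden.
        raise KeyError(f"variable {name!r} is not defined in the config")
    value = infer_value(raw)
    if isinstance(value, list) and value:
        # A list sets the sweep domain; the value becomes the first item.
        variables[name]["domain"] = value
        variables[name]["value"] = value[0]
    else:
        variables[name]["value"] = value
    return variables
```

For example, apply_set({"CONCURRENCY": {"value": 1}}, "CONCURRENCY=[1,2,4]") leaves the variable with a domain of [1, 2, 4] and a value of 1 in this sketch.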

You can also read a variable's domain inside expressions:

script:
  - echo "all concurrencies=${{ variables.CONCURRENCY.domain }}"
  - echo "max concurrency=${{ variables.CONCURRENCY.domain | max }}"

artifacts

artifacts are named resources that you can reference in expressions via ${{ artifacts.NAME.path }}.

In v0.1, the code resolves uri to a local path only for:

  • fs://<path>
  • file://<path>

Other schemes are kept as-is (path remains the URI string). No automatic download/pull happens today.
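
As a rough illustration of the rule above, the following Python sketch strips the two recognized schemes and passes everything else through. The function name resolve_artifact_path is hypothetical, and whether sflow additionally normalizes relative paths is not stated here, so the sketch leaves them untouched.

```python
LOCAL_SCHEMES = ("fs://", "file://")  # the only schemes resolved to a local path


def resolve_artifact_path(uri):
    """Return the value that artifacts.NAME.path would expose (sketch only)."""
    for scheme in LOCAL_SCHEMES:
        if uri.startswith(scheme):
            # Drop the scheme prefix; the remainder is used as the path.
            return uri[len(scheme):]
    # Any other scheme is kept as-is: no download or pull is attempted.
    return uri
```

With this sketch, fs://./models/qwen yields ./models/qwen, fs:///mnt/models/qwen yields /mnt/models/qwen, and a URI with any other scheme is returned unchanged.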

Example:

artifacts:
  model_dir:
    uri: fs://./models/qwen

Override artifacts via CLI (--artifact)

sflow run --file sflow.yaml --artifact model_dir=fs:///mnt/models/qwen

As with --set, the artifact must already be defined under artifacts; overriding an undefined artifact is an error.

backends

local backend

backends:
  local:
    type: local
    default: true
    nodes: 1

slurm backend

backends:
  slurm_cluster:
    type: slurm
    default: true
    account: ${{ variables.SLURM_ACCOUNT }}
    partition: ${{ variables.SLURM_PARTITION }}
    time: 00:30:00
    nodes: 2
    gpus_per_node: 8 # sflow planning only
    extra_args:
      - "--gpus-per-node=8" # passed to salloc

gpus_per_node tells sflow how many GPU indices each node has for planning and packing. It does not add --gpus-per-node to salloc; include that flag in extra_args when your cluster requires it.

If you are already inside a Slurm allocation (e.g. via salloc or sbatch), run sflow as usual:

sflow run --file sflow.yaml

In that case sflow skips salloc and attempts to infer node info from the current environment (SLURM_JOB_ID/SLURM_JOB_NODELIST).

operators

An operator defines how a task is launched. Two common ones:

  • bash: run locally via bash
  • srun: run via Slurm srun (supports common Pyxis --container-* flags)

Example (bash):

operators:
  local_bash:
    type: bash

Example (srun + container):

operators:
  runtime:
    type: srun
    container_image: nvcr.io/xxx/yyy:tag
    container_mount_home: false
    container_mounts:
      - "/mnt:/mnt:rw"
    extra_args:
      - "--shm-size=64g"

workflow / tasks

Minimal task:

workflow:
  name: demo
  tasks:
    - name: step1
      script:
        - echo "hello"

depends_on

- name: step2
  depends_on: [step1]
  script:
    - echo "step2"
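
A depends_on list turns the task list into a dependency graph. The Python sketch below shows one way to derive a valid run order from such declarations using the standard library's graphlib (Python 3.9+). It only illustrates the ordering idea; the helper name execution_order is hypothetical and sflow's own scheduler is not shown here.

```python
from graphlib import TopologicalSorter

# Task declarations reduced to the two fields that matter for ordering.
tasks = [
    {"name": "step1"},
    {"name": "step2", "depends_on": ["step1"]},
]


def execution_order(task_list):
    """Return task names so that every task comes after its dependencies."""
    # TopologicalSorter expects {node: set_of_predecessors}.
    graph = {t["name"]: set(t.get("depends_on", [])) for t in task_list}
    return list(TopologicalSorter(graph).static_order())


print(execution_order(tasks))  # ['step1', 'step2']
```

A cycle in depends_on would make static_order() raise graphlib.CycleError in this sketch.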

resources (nodes / GPUs)

- name: server
  resources:
    nodes:
      indices: [0]
    gpus:
      count: 4
  script:
    - echo "server on node0, 4 gpus"

resources.nodes supports indices, count, and exclude. exclude removes nodes from the allocation before indices, count, or GPU packing are applied:

- name: workers
  resources:
    nodes:
      exclude: [0]
      count: 2
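
The order of operations (exclude first, then count) can be sketched in Python as below. The helper select_nodes is hypothetical, and picking the first remaining nodes for count is an assumption; sflow's planner may choose differently, for example based on GPU packing. How indices interacts with exclude is not covered by this sketch.

```python
def select_nodes(allocation, exclude=(), count=None):
    """Pick node indices for a task: apply exclude, then take `count` nodes."""
    excluded = set(exclude)
    remaining = [node for node in allocation if node not in excluded]
    if count is None:
        return remaining
    if count > len(remaining):
        raise ValueError(
            f"requested {count} nodes but only {len(remaining)} remain after exclude"
        )
    # Assumption: take the first `count` of the remaining nodes.
    return remaining[:count]


# A 4-node allocation with node 0 excluded and two nodes requested.
print(select_nodes([0, 1, 2, 3], exclude=[0], count=2))  # [1, 2]
```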

resources.nodes.release_after and resources.gpus.release_after control when that resource kind can be reused by later tasks in the DAG. The behavior when release_after is omitted differs by resource kind:

  • Nodes: reservations are exclusive only when resources.nodes.release_after is explicitly set. When it is omitted, both resources.nodes.indices and resources.nodes.count are placement constraints and may overlap with other planned tasks.
  • GPUs: sflow infers the safe behavior. Tasks without readiness probes release GPUs after task completion for downstream dependents, while tasks with readiness probes keep GPUs until workflow completion because they may still be running after they become ready.

Accepted values:

  • task_ready: the task can release the resource after its readiness probe succeeds.
  • task_completion: the resource can be reused after the task reaches any terminal status (COMPLETED, FAILED, TIMEOUT, or CANCELLED).
  • workflow_completion: explicitly reserve the resource for the whole workflow.

In the example below, check_entire_env has no readiness probe, so its 8 GPUs are released when it completes and can be reused by the worker replicas that depend on it:

- name: check_entire_env
  resources:
    gpus:
      count: 8
  script:
    - nvidia-smi

- name: worker
  depends_on: [check_entire_env]
  replicas:
    count: 4
    policy: parallel
  resources:
    gpus:
      count: 2

Policies are independent per resource kind, so a task can release GPUs at readiness while keeping a node reservation until workflow completion. Dry-run performs a scheduler rehearsal for these lifetimes; it validates temporal resource usage rather than only summing all declared resources.
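
The default rules for release_after described in this section can be summarized in a small Python sketch. The function names are hypothetical and the code only restates the documented rules; it is not sflow's scheduler.

```python
VALID_POLICIES = {"task_ready", "task_completion", "workflow_completion"}


def effective_gpu_release(release_after=None, has_readiness_probe=False):
    """Return the GPU release policy that applies to a task."""
    if release_after is not None:
        if release_after not in VALID_POLICIES:
            raise ValueError(f"unknown release_after value: {release_after!r}")
        return release_after
    # Omitted: a task with a readiness probe may keep running after it is
    # ready, so its GPUs are held until the workflow completes.
    return "workflow_completion" if has_readiness_probe else "task_completion"


def node_reservation_is_exclusive(release_after=None):
    """Nodes are reserved exclusively only when release_after is set explicitly."""
    return release_after is not None
```

For instance, a probe-less task such as check_entire_env resolves to task_completion for GPUs, while a server task with a readiness probe resolves to workflow_completion unless it sets release_after itself.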

replicas

Run multiple instances of a task in parallel or sequentially:

- name: worker
  replicas:
    count: 4
    policy: parallel
  script:
    - echo "replica $SFLOW_REPLICA_INDEX"

See Replicas for details on policies, variable sweeps, and GPU allocation.

probes (readiness / failure)

Probes are useful for service-style tasks (e.g. start a server, then run a client):

- name: api_server
  script:
    - python -m http.server 8000
  probes:
    readiness:
      tcp_port:
        port: 8000

probes.readiness may also be a list of probes; all must trigger before the task is ready. Probe types include tcp_port, http_get, http_post, and log_watch. probes.failure marks a running task as failed when its condition is detected, which fail-fast uses to cancel downstream work.
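
To illustrate what a tcp_port readiness check and the "all must trigger" rule amount to, here is a minimal Python sketch. The function names, the host parameter, and the timeout are assumptions for illustration; sflow's actual probe implementation and its options are not shown here.

```python
import socket


def tcp_port_ready(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable: not ready yet.
        return False


def task_ready(probes):
    """A task is ready only when every readiness probe reports success."""
    return all(probe() for probe in probes)


# Usage sketch: readiness of the api_server example, expressed as one probe.
readiness = [lambda: tcp_port_ready(8000)]
```

Calling task_ready(readiness) in a polling loop would report True once the server from the example is listening on port 8000.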