Version: develop

Backends

A backend provides compute resources for task execution. v0.1 ships with local and slurm backends.

Default behavior (simplest)

If you omit backends: entirely, sflow creates a default local backend:

backend: local (synthetic allocation: localhost, localhost-1, ...)
default operator: bash

This is why a minimal workflow with just workflow: and tasks: works without any backend/operator config.

Explicit local backend

Explicit local backend example:

version: "0.1"

backends:
  - name: local
    type: local
    default: true

workflow:
  name: wf
  tasks:
    - name: hello
      script:
        - echo hello

Slurm backend

Slurm backend example:

version: "0.1"

backends:
  - name: slurm_cluster
    type: slurm
    default: true
    account: "your_slurm_account"
    partition: "your_slurm_partition"
    time: "00:10:00"
    nodes: 1
    gpus_per_node: 8        # required for sflow planning; set to 0 for CPU-only partitions

workflow:
  name: wf
  tasks:
    - name: slurm_task
      script:
        - echo hello

CPU-only partitions

Set gpus_per_node: 0 to target a CPU-only Slurm partition. With zero capacity, tasks that declare resources.gpus will be rejected up front with a clear error, preventing silent CUDA failures at runtime.

note

gpus_per_node describes cluster topology for sflow resource planning and GPU index assignment. It does not add --gpus-per-node to salloc. If your cluster requires that Slurm allocation flag, add it explicitly in backend extra_args.

Notes:

If you don't specify task.operator, the backend chooses its default operator:
- local backend → bash
- slurm backend → srun
You can run sflow asynchronously via sbatch:
- sbatch returns immediately with a job id; sflow runs inside the batch allocation.
- In this mode, sflow will reuse the current allocation (no extra salloc).
- Make sure your --workspace-dir/--output-dir point to a shared filesystem so you can inspect logs while it runs.

Example:

sbatch --job-name=sflow --output=sflow-%j.out --wrap "cd $SLURM_SUBMIT_DIR && sflow run --file sflow.yaml"

Cluster-specific flags (`extra_args`)

Some Slurm clusters require additional flags for job submission (e.g., GPU resources, network segments, or custom policies). Use the extra_args section to pass these cluster-specific options:

version: "0.1"

backends:
  - name: gpu_cluster
    type: slurm
    default: true
    account: "myproject"
    partition: "gpu"
    time: "01:00:00"
    nodes: 2
    gpus_per_node: 8
    extra_args:
      - "--gpus-per-node=8"
      - "--segment=2"
      - "--exclusive"

workflow:
  name: wf
  tasks:
    - name: gpu_task
      script:
        - nvidia-smi
        - echo "Running on GPU nodes"

Common cluster-specific flags include:

Flag	Description
`--gpus-per-node=N`	Request N GPUs per node
`--segment=<name>`	Target a specific network segment or job class, usually GB200 / GB300
`--exclusive`	Request exclusive node access
`--mem=<size>`	Memory per node (e.g., `128G`)

tip

Check your cluster's documentation or run sinfo / scontrol show partition to discover available partitions, segments, and resource constraints.

note

When using sflow batch mode, you can also pass extra Slurm flags directly via the -e flag without modifying the YAML file:

sflow batch -f workflow.yaml -e "--gpus-per-node=8" -e "--segment=2"

This is useful for quick adjustments or when testing different cluster configurations.

Default behavior (simplest)​

Explicit local backend​

Slurm backend​

Cluster-specific flags (extra_args)​

Default behavior (simplest)

Explicit local backend

Slurm backend

Cluster-specific flags (`extra_args`)