fio#

The Fio workload runs the open-source fio (Flexible I/O Tester) tool from CloudAI. It supports standalone and Slurm systems.

Scenarios#

Use conf/common/test_scenario/fio.toml for Slurm clusters. It runs the common fio test in a container and starts 4 tasks total: num_nodes = 2 and num_tasks_per_node = 2.

Use conf/common/test_scenario/fio_local.toml for local/standalone runs. It uses host-installed fio and no container image.

Fio Arguments#

Put fio CLI options under cmd_args.args. CloudAI does not declare a fixed fio option list; keys pass through verbatim.

name = "fio"
description = "fio random write smoke test"
test_template_name = "Fio"

[cmd_args]
fio_binary = "fio"

  [cmd_args.args]
  name = "fio-smoke"
  filename = "/tmp/cloudai-fio-test"
  rw = "randwrite"
  bs = "128k"
  size = "80m"
  iodepth = 1
  numjobs = 1
  group_reporting = true
  thread = true

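This emits options like --rw=randwrite and --group_reporting. TOML bare keys may contain only ASCII letters, digits, _ and -, so quote a key when the option name contains any other character. Quoting is always allowed, as with the hyphenated key here: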

[cmd_args.args]
"max-jobs" = 4

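The mapping from cmd_args.args keys to fio flags can be sketched as follows. This is a minimal illustration, not CloudAI's implementation: the function name to_fio_flags is hypothetical, and dropping a false boolean is an assumption, since the text above only shows what a true value produces.

```python
from typing import Any


def to_fio_flags(args: dict[str, Any]) -> list[str]:
    """Render a flat mapping of fio options as long-form CLI flags."""
    flags: list[str] = []
    for key, value in args.items():
        if value is True:
            # A true boolean becomes a bare flag, e.g. --group_reporting.
            flags.append(f"--{key}")
        elif value is False:
            # Assumption: a false boolean is simply omitted.
            continue
        else:
            # Keys are used verbatim, so "max-jobs" stays "max-jobs".
            flags.append(f"--{key}={value}")
    return flags


example = {
    "rw": "randwrite",
    "bs": "128k",
    "iodepth": 1,
    "group_reporting": True,
    "max-jobs": 4,
}
print(to_fio_flags(example))
```

Because keys are not checked against a fixed option list, a misspelled option name would only be reported by fio itself at run time.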
List values are CloudAI DSE sweeps:

[cmd_args.args]
bs = ["128k", "1m"]
iodepth = [1, 8, 32]
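
To see how many runs a sweep like this can produce, the sketch below expands list-valued options into concrete option sets. It assumes the default grid_search agent enumerates the full Cartesian product of the lists; expand_sweep is a hypothetical helper, not a CloudAI function.

```python
from itertools import product
from typing import Any


def expand_sweep(args: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand list-valued options into one concrete args dict per combination."""
    swept = {k: v for k, v in args.items() if isinstance(v, list)}
    fixed = {k: v for k, v in args.items() if not isinstance(v, list)}
    keys = list(swept)
    runs = []
    for values in product(*(swept[k] for k in keys)):
        # Scalars are shared by every run; swept keys take one value each.
        runs.append({**fixed, **dict(zip(keys, values))})
    return runs


runs = expand_sweep({"rw": "randwrite", "bs": ["128k", "1m"], "iodepth": [1, 8, 32]})
print(len(runs))  # 2 block sizes x 3 queue depths = 6 combinations
```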

To repeat the same fio option in one run, use a nested table. The nested keys serve only as stable TOML item names and are not passed to fio; each value is emitted as a separate occurrence of the option:

[cmd_args.args.a]
"0" = "=foo"  # --a==foo
"1" = "bar"   # --a=bar

[cmd_args.args."client"]
"0" = "host1" # --client=host1
"1" = "host2" # --client=host2

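The expansion of a nested table into repeated flags can be sketched like this. The helper to_repeated_flags is hypothetical, and emitting values in declaration order is an assumption about ordering.

```python
from typing import Any


def to_repeated_flags(option: str, items: dict[str, Any]) -> list[str]:
    """Emit one --option=value flag per nested item, ignoring the item names."""
    # Assumption: values are emitted in the order the items were declared.
    return [f"--{option}={value}" for value in items.values()]


print(to_repeated_flags("a", {"0": "=foo", "1": "bar"}))
print(to_repeated_flags("client", {"0": "host1", "1": "host2"}))
```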
For existing or complex fio configs, use job_file:

[cmd_args]
fio_binary = "/tmp/fio/fio"
job_file = "/tmp/kv_emulation.fio"

Slurm Tasks#

The scenario's num_nodes sets the node count, and cmd_args.num_tasks_per_node sets the number of Slurm tasks per node. Total fio tasks = num_nodes * num_tasks_per_node. CloudAI passes these values to Slurm as --ntasks (the total) and --ntasks-per-node.

[[Tests]]
id = "Tests.fio"
test_name = "fio"
num_nodes = 2

  [Tests.cmd_args]
  docker_image_url = "openeuler/fio:3.42-oe2403sp3"
  num_tasks_per_node = 2
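
The task arithmetic for this scenario can be checked with a short sketch. The helper slurm_task_flags is hypothetical and only mirrors the formula stated above.

```python
def slurm_task_flags(num_nodes: int, num_tasks_per_node: int = 1) -> list[str]:
    """Derive the Slurm task flags from the node count and per-node task count."""
    total_tasks = num_nodes * num_tasks_per_node
    return [f"--ntasks={total_tasks}", f"--ntasks-per-node={num_tasks_per_node}"]


print(slurm_task_flags(num_nodes=2, num_tasks_per_node=2))
```

With num_nodes = 2 and num_tasks_per_node = 2, this yields 4 fio tasks in total, matching the fio.toml scenario described earlier.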

Default Metric#

For agent_metrics = ["default"], CloudAI aggregates the parsed fio summary rows. The default settings report the total bandwidth across all operations and tasks. Bandwidth values are normalized to MiB/s and latency values to usec before aggregation.

[cmd_args]
metric_operation = "all"    # read, write, trim, all, first
metric_name = "bw"          # bw, iops, latency
metric_aggregate = "sum"    # sum, mean, min, max, first

Raw parsed rows and normalized metric values are written to fio_summary.csv.
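
The normalize-then-aggregate step can be illustrated with a small sketch. The row layout (operation, bw, unit), the unit table, and the function aggregate_bw are all hypothetical and do not reflect the columns of fio_summary.csv. The "first" option of metric_operation is not modeled, because its exact semantics are not described here.

```python
from statistics import mean

# Conversion factors to MiB/s for binary bandwidth units.
BW_TO_MIB = {"KiB/s": 1 / 1024, "MiB/s": 1.0, "GiB/s": 1024.0}

AGGREGATES = {
    "sum": sum,
    "mean": mean,
    "min": min,
    "max": max,
    "first": lambda values: values[0],
}


def aggregate_bw(rows: list[dict], operation: str = "all", aggregate: str = "sum") -> float:
    """Normalize each selected row to MiB/s, then aggregate the values."""
    selected = [r for r in rows if operation == "all" or r["operation"] == operation]
    values = [r["bw"] * BW_TO_MIB[r["unit"]] for r in selected]
    return AGGREGATES[aggregate](values)


rows = [
    {"operation": "read", "bw": 2048.0, "unit": "KiB/s"},   # 2 MiB/s
    {"operation": "write", "bw": 3.0, "unit": "MiB/s"},     # 3 MiB/s
    {"operation": "write", "bw": 1.0, "unit": "GiB/s"},     # 1024 MiB/s
]
print(aggregate_bw(rows))                    # sum over all operations
print(aggregate_bw(rows, "write", "max"))    # largest write bandwidth
```

Normalizing before aggregating matters because a sum or mean over rows in mixed units would otherwise be meaningless.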

API Documentation#

Command Arguments#

class cloudai.workloads.fio.fio.FioCmdArgs(*, fio_binary: str = 'fio', job_file: str | None = None, args: dict[str, ~typing.Any] = <factory>, docker_image_url: str | None = None, num_tasks_per_node: int | None = 1, metric_operation: str = 'all', metric_name: str = 'bw', metric_aggregate: str = 'sum', **extra_data: ~typing.Any)[source]#

Bases: CmdArgs

Command line arguments for fio.

field fio_binary: str = 'fio'#: fio executable to run. Use an absolute path for patched/custom fio builds.

field job_file: str | None = None#: Optional fio job/config file. When set, it is appended after CLI options.

field args: dict[str, Any] [Optional]#: fio CLI options, without leading --. Keys are passed to fio verbatim.

field docker_image_url: str | None = None#: Optional Docker image to use for Slurm container execution.

field num_tasks_per_node: int | None = 1#: Optional Slurm task count per node for multi-node fio runs.

field metric_operation: str = 'all'#: Operation used for the default metric: read, write, trim, all, or first.

field metric_name: str = 'bw'#: Metric used for the default metric: bw, iops, or latency.

field metric_aggregate: str = 'sum'#: Aggregation used for the default metric: sum, mean, min, max, or first.

fio_args() → dict[str, Any][source]#: Return only arguments intended for the fio CLI.

Test Definition#

class cloudai.workloads.fio.fio.FioTestDefinition(*, name: str, description: str, test_template_name: str, cmd_args: ~cloudai.workloads.fio.fio.FioCmdArgs, dse_excluded_args: list[str] = <factory>, extra_env_vars: dict[str, str | ~typing.List[str]] = {}, extra_cmd_args: dict[str, str] = {}, extra_container_mounts: list[str] = [], git_repos: list[~cloudai._core.installables.git_repo.GitRepo] = [], nsys: ~cloudai.models.workload.NsysConfiguration | None = None, predictor: ~cloudai.models.workload.PredictorConfig | None = None, training_report: ~cloudai.models.workload.TrainingReportConfig | None = None, agent: str = 'grid_search', agent_steps: int = 1, agent_metrics: list[str] = ['default'], agent_reward_function: str = 'inverse', agent_config: dict[str, ~typing.Any] | None = None, env_params: dict[str, ~cloudai.configurator.env_params.EnvParamSpec] = <factory>)[source]#

Bases: TestDefinition

Test definition for fio.

property is_domain_randomization_enabled: bool#: Whether the config declares domain randomization, that is, at least one env_params annotation.

is_dse_excluded_arg(path: str) → bool#: Return whether a dot-separated cmd_args path should be ignored by DSE.

is_env_sampled(cmd_args_path: str) → bool#: Whether a cmd_args field is env-sampled (env draws it per trial, not the agent).

validator validate_env_params » all fields#

Validate env_params annotations against cmd_args.

env_params is an annotation: each key names a cmd_args field whose value is the candidate set (the single source of truth), and the entry carries only how to sample. So each key must name a real cmd_args field whose value is a candidate list; a scalar is already fixed, so annotating it is a meaningless label and is rejected here. When weights are declared, the list needs >= 2 values and the weights must align 1:1 with it. Sampling, persistence, the per-trial cmd_args overlay, and the cache key all live in CloudAIGymEnv; keeping this shape check in core lets the overlay stay agent- and workload-agnostic rather than re-implemented per workload.
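
The shape check described above can be approximated with the sketch below. It is an illustration under stated assumptions, not the validator's code: cmd_args and env_params are modeled as plain dicts, keys are treated as dot-separated cmd_args paths, and the "weights" entry name and both helper functions are hypothetical.

```python
from typing import Any


def resolve_path(cmd_args: dict[str, Any], path: str) -> Any:
    """Follow a dot-separated path such as 'args.bs' through nested dicts."""
    node: Any = cmd_args
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            raise ValueError(f"{path}: no such cmd_args field")
        node = node[part]
    return node


def check_env_params(cmd_args: dict[str, Any], env_params: dict[str, dict]) -> None:
    """Reject annotations that do not point at a candidate list."""
    for path, spec in env_params.items():
        candidates = resolve_path(cmd_args, path)
        if not isinstance(candidates, list):
            # A scalar is already fixed, so there is nothing to sample.
            raise ValueError(f"{path}: value is a scalar, not a candidate list")
        weights = spec.get("weights")  # hypothetical entry name
        if weights is None:
            continue
        if len(candidates) < 2:
            raise ValueError(f"{path}: weights need at least 2 candidates")
        if len(weights) != len(candidates):
            raise ValueError(f"{path}: weights must align 1:1 with candidates")


cmd_args = {"args": {"bs": ["128k", "1m"], "rw": "randwrite"}}
check_env_params(cmd_args, {"args.bs": {"weights": [0.7, 0.3]}})  # passes silently
```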