AIConfigurator

This workload (test_template_name = "Aiconfigurator") runs the AIConfigurator predictor using the installed aiconfigurator Python package. It is a standalone workload: it does not require Slurm, Kubernetes, or RunAI.

Outputs

Each test run produces:

  • report.json: Predictor output (JSON dict of metrics and metadata)

  • stdout.txt / stderr.txt: Predictor logs

  • run_simple_predictor.sh: Repro script containing the exact executed command (useful for debugging)
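As a rough illustration of consuming these outputs, the sketch below loads report.json from a run directory and keeps only its top-level numeric entries. The helper names load_report and numeric_metrics are hypothetical and not part of CloudAI. The sketch also assumes the report is a flat JSON object, and the actual layout may differ.

```python
import json
from pathlib import Path


def load_report(run_dir):
    """Load report.json from a test-run output directory and return it as a dict."""
    report_path = Path(run_dir) / "report.json"
    with report_path.open(encoding="utf-8") as fh:
        report = json.load(fh)
    if not isinstance(report, dict):
        raise ValueError(f"{report_path} does not contain a JSON object")
    return report


def numeric_metrics(report):
    """Keep only top-level entries whose values are plain numbers (bools excluded)."""
    return {
        key: float(value)
        for key, value in report.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
```

Calling numeric_metrics(load_report(path_to_run_dir)) gives a dict that is convenient to print or compare across runs; non-numeric metadata is dropped rather than converted.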

Usage Example

Test TOML example (Disaggregated mode):

name = "aiconfigurator_disagg_demo"
description = "Example AIConfigurator disaggregated predictor"
test_template_name = "Aiconfigurator"

[cmd_args]
model_name = "LLAMA3.1_70B"
system = "h200_sxm"
backend = "trtllm"
version = "0.20.0"
isl = 4000
osl = 500

  [cmd_args.disagg]
  p_tp = 1
  p_pp = 1
  p_dp = 1
  p_bs = 1
  p_workers = 1

  d_tp = 1
  d_pp = 1
  d_dp = 1
  d_bs = 8
  d_workers = 2

  prefill_correction_scale = 1.0
  decode_correction_scale = 1.0

Test TOML example (Aggregated/IFB mode):

name = "aiconfigurator_agg_demo"
description = "Example AIConfigurator aggregated predictor"
test_template_name = "Aiconfigurator"

[cmd_args]
model_name = "LLAMA3.1_70B"
system = "h200_sxm"
backend = "trtllm"
version = "0.20.0"
isl = 4000
osl = 500

  [cmd_args.agg]
  batch_size = 8
  ctx_tokens = 16
  tp = 1
  pp = 1
  dp = 1
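To make the two layouts easier to compare, the sketch below estimates how many GPUs a configuration describes. It rests on two assumptions that the source does not state: that one worker occupies tp * pp * dp GPUs, and that a test sets exactly one of the agg and disagg tables. The function names are hypothetical, and the input is a plain dict mirroring the [cmd_args] table rather than a CloudAI object.

```python
def worker_gpus(tp, pp, dp):
    """GPUs for one worker, ASSUMING one GPU per (tp, pp, dp) rank."""
    return tp * pp * dp


def total_gpus(cmd_args):
    """Estimate the GPU count described by a [cmd_args]-shaped dict."""
    agg = cmd_args.get("agg")
    disagg = cmd_args.get("disagg")
    # ASSUMPTION: exactly one of the two modes is configured per test.
    if (agg is None) == (disagg is None):
        raise ValueError("expected exactly one of 'agg' or 'disagg'")
    if agg is not None:
        return worker_gpus(agg["tp"], agg["pp"], agg["dp"])
    prefill = disagg["p_workers"] * worker_gpus(
        disagg["p_tp"], disagg["p_pp"], disagg["p_dp"]
    )
    decode = disagg["d_workers"] * worker_gpus(
        disagg["d_tp"], disagg["d_pp"], disagg["d_dp"]
    )
    return prefill + decode


disagg_demo = {
    "disagg": {
        "p_tp": 1, "p_pp": 1, "p_dp": 1, "p_workers": 1,
        "d_tp": 1, "d_pp": 1, "d_dp": 1, "d_workers": 2,
    }
}
agg_demo = {"agg": {"tp": 1, "pp": 1, "dp": 1}}

print(total_gpus(disagg_demo))  # 3 under the assumptions above
print(total_gpus(agg_demo))     # 1 under the assumptions above
```

Under these assumptions the disaggregated example amounts to one prefill GPU plus two decode GPUs, while the aggregated example uses a single GPU.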

Running

uv run cloudai run --system-config conf/common/system/standalone_system.toml \
   --tests-dir conf/experimental/aiconfigurator/test \
   --test-scenario conf/experimental/aiconfigurator/test_scenario/aiconfigurator_disagg.toml

API Documentation

Command Arguments

class cloudai.workloads.aiconfig.aiconfigurator.AiconfiguratorCmdArgs(
*,
python_executable: str = 'python',
model_name: str,
system: str,
backend: str = 'trtllm',
version: str = '0.20.0',
isl: int | List[int],
osl: int | List[int],
agg: Agg | None = None,
disagg: Disagg | None = None,
**extra_data: Any,
)

Bases: CmdArgs

Command arguments for Aiconfigurator workload with nested agg/disagg configs.

python_executable: str
system: str
backend: str
version: str
isl: int | List[int]
osl: int | List[int]
agg: Agg | None
disagg: Disagg | None
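Because isl and osl accept either a single int or a list of ints, code that consumes them usually has to normalize both shapes. The sketch below shows one way to do that and to enumerate every (isl, osl) pair. It assumes list values are meant to be swept as a grid, which the source does not confirm, and the helper names are hypothetical.

```python
from itertools import product


def as_list(value):
    """Wrap a scalar in a list; copy a list or tuple unchanged."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def expand_lengths(isl, osl):
    """Return every (isl, osl) combination as a list of dicts."""
    return [
        {"isl": i, "osl": o}
        for i, o in product(as_list(isl), as_list(osl))
    ]


for combo in expand_lengths([1000, 4000], 500):
    print(combo)
# {'isl': 1000, 'osl': 500}
# {'isl': 4000, 'osl': 500}
```

Normalizing to lists first keeps the scalar case and the list case on the same code path.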

Test Definition

class cloudai.workloads.aiconfig.aiconfigurator.AiconfiguratorTestDefinition(
*,
name: str,
description: str,
test_template_name: str,
cmd_args: AiconfiguratorCmdArgs,
extra_env_vars: dict[str, str | List[str]] = {},
extra_cmd_args: dict[str, str] = {},
extra_container_mounts: list[str] = [],
git_repos: list[GitRepo] = [],
nsys: NsysConfiguration | None = None,
predictor: PredictorConfig | None = None,
agent: str = 'grid_search',
agent_steps: int = 1,
agent_metrics: list[str] = ['default'],
agent_reward_function: str = 'inverse',
)

Bases: TestDefinition

Test object for running Aiconfigurator predictor as a workload.

cmd_args: AiconfiguratorCmdArgs
property installables: list[Installable]

Command Generation Strategy (Standalone)

class cloudai.workloads.aiconfig.standalone_command_gen_strategy.AiconfiguratorStandaloneCommandGenStrategy(
system: System,
test_run: TestRun,
)

Bases: CommandGenStrategy

Generate a standalone command that invokes the Aiconfigurator predictor and writes JSON output.

store_test_run() -> None

Store the test run information in the output folder.

CloudAI has all the information needed to store the test run only at command generation time.

gen_exec_command() -> str

Generate the execution command for a test based on the given parameters.

Returns: The generated execution command.

Return type: str

Report Generation Strategy

class cloudai.workloads.aiconfig.report_generation_strategy.AiconfiguratorReportGenerationStrategy(
system: System,
tr: TestRun,
)

Bases: ReportGenerationStrategy

Generate metrics from Aiconfigurator predictor outputs.

metrics: ClassVar[list[str]] = ['default', 'ttft_ms', 'tpot_ms', 'tokens_per_s_per_gpu', 'tokens_per_s_per_user']
can_handle_directory() -> bool
generate_report() -> None
get_metric(metric: str) -> float
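The snippet below is a minimal sketch of how a metric lookup over the predictor report might behave. It is not the CloudAI implementation. The default_key mapping for 'default' and the missing sentinel are assumptions chosen for illustration, as is the idea that report.json stores each metric under a top-level key of the same name.

```python
KNOWN_METRICS = (
    "default",
    "ttft_ms",
    "tpot_ms",
    "tokens_per_s_per_gpu",
    "tokens_per_s_per_user",
)


def lookup_metric(report, metric, default_key="tokens_per_s_per_gpu", missing=-1.0):
    """Return a metric from a report dict as a float.

    default_key and missing are hypothetical choices, not CloudAI behavior.
    """
    if metric not in KNOWN_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    key = default_key if metric == "default" else metric
    value = report.get(key)
    # Treat absent or non-numeric values (including bools) as missing.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return missing
    return float(value)


sample = {"ttft_ms": 42.0, "tokens_per_s_per_gpu": 1200}
print(lookup_metric(sample, "ttft_ms"))   # 42.0
print(lookup_metric(sample, "default"))   # 1200.0
print(lookup_metric(sample, "tpot_ms"))   # -1.0
```

Returning a sentinel instead of raising keeps a sweep running when one run fails to produce a metric; callers that prefer a hard failure can check for the sentinel and raise.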