Benchmarking#
PhysicsNeMo Curator uses a three-tier benchmarking strategy:
| Tool | Purpose | Scope |
|---|---|---|
| pytest-benchmark | Fast per-PR regression checks in CI | Current commit only |
| ASV (airspeed velocity) | Long-term historical performance tracking | Across git history |
| Criterion | Rust micro-benchmarks | Rust core library |
Quick Start#
```bash
# Install dev dependencies (includes asv + pytest-benchmark)
make install

# Build the native extension
make develop

# Run pytest-benchmark (fast, current code)
make bench

# Run ASV on the current commit
make asv-run

# Preview the ASV dashboard
make asv-preview
```
pytest-benchmark (CI Benchmarks)#
pytest-benchmark runs inside the normal test suite and is designed for fast
per-PR checks. Benchmark tests live in test/ and use the benchmark fixture
provided by pytest-benchmark.
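```python
import pytest


@pytest.mark.benchmark
def test_pipeline_throughput(benchmark):
    """Benchmark pipeline item processing."""
    from physicsnemo_curator.core.base import Pipeline

    # ... setup ... (elided here; it must bind `pipeline` to a Pipeline instance)

    # The fixture calls pipeline.__getitem__(0) repeatedly and records timings.
    benchmark(pipeline.__getitem__, 0)
```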
Run benchmarks:
```bash
# Run only benchmarks
uv run pytest test/ --benchmark-only

# Skip benchmarks during normal test runs
uv run pytest test/ --benchmark-skip

# Compare against saved baseline
uv run pytest test/ --benchmark-only --benchmark-compare
```
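Saved results are stored as JSON in .benchmarks/. Comparing with --benchmark-compare requires an earlier run saved there, for example one produced with --benchmark-autosave.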
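The saved JSON can also be inspected directly. The sketch below assumes the usual pytest-benchmark layout, with a top-level `benchmarks` list whose entries carry a `name` and a `stats` mapping that includes `mean`. The helper names and the 5 % default threshold are illustrative choices, not part of the project.

```python
import json
from pathlib import Path


def load_means(path):
    """Return {benchmark name: mean seconds} from one saved JSON file."""
    data = json.loads(Path(path).read_text())
    return {entry["name"]: entry["stats"]["mean"] for entry in data["benchmarks"]}


def slowdowns(baseline, current, threshold=0.05):
    """Return benchmarks whose mean grew by more than `threshold` (relative)."""
    flagged = {}
    for name, base_mean in baseline.items():
        if name in current and base_mean > 0:
            change = (current[name] - base_mean) / base_mean
            if change > threshold:
                flagged[name] = change
    return flagged
```

A typical use would be `slowdowns(load_means(old_file), load_means(new_file))`, where both paths point at files under .benchmarks/.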
ASV (Historical Benchmarks)#
Airspeed Velocity tracks performance across the
project’s git history and produces an interactive web dashboard. The project
uses environment_type: "existing" — benchmarks run in the current
environment rather than isolated virtualenvs, so you must build the extension
first with make develop.
Benchmark Files#
ASV benchmarks live in the benchmarks/ directory at the project root:
```text
benchmarks/
├── __init__.py
├── _helpers.py        # Shared benchmark utilities
├── asv_build.py       # ASV build configuration
├── bench_atm.py       # Atomic data benchmarks
├── bench_backends.py  # Execution backend benchmarks
├── bench_da.py        # DataArray benchmarks
└── bench_mesh.py      # Mesh pipeline benchmarks
```
Writing ASV Benchmarks#
Benchmarks are plain Python classes/functions with magic name prefixes:
| Prefix | Measures |
|---|---|
| time_ | Wall-clock execution time |
| mem_ | Memory footprint of returned object |
| peakmem_ | Peak resident memory |
| track_ | Arbitrary numeric value |
| timeraw_ | Execution time in a fresh subprocess |
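```python
class TimePipelineIteration:
    """Benchmark per-item pipeline throughput."""

    # ASV runs every benchmark once per parameter value.
    params = [10, 100, 1000]
    param_names = ["num_items"]

    def setup(self, num_items):
        """Called before each benchmark (excluded from timing)."""
        # build_pipeline is not defined in this snippet; it is assumed to be
        # a project helper (for example from benchmarks/_helpers.py).
        self.pipeline = build_pipeline(num_items)

    def time_iterate_all(self, num_items):
        """Time iterating through every item."""
        for i in range(len(self.pipeline)):
            self.pipeline[i]
```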
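The non-timing prefixes follow the same class layout. The sketch below is a hypothetical example that uses a plain list in place of a real pipeline; the class and method names are illustrative, and only the mem_, peakmem_, and track_ prefixes and the optional `unit` attribute are ASV conventions.

```python
class PipelineFootprint:
    """Hypothetical benchmark class using non-timing ASV prefixes."""

    params = [10, 100, 1000]
    param_names = ["num_items"]

    def setup(self, num_items):
        # Stand-in for a real pipeline: a plain list of floats.
        self.items = [float(i) for i in range(num_items)]

    def mem_items(self, num_items):
        # mem_*: ASV records the memory footprint of the returned object.
        return self.items

    def peakmem_copy_items(self, num_items):
        # peakmem_*: ASV records peak resident memory while this runs.
        copies = [list(self.items) for _ in range(10)]
        del copies

    def track_item_count(self, num_items):
        # track_*: ASV records whatever number is returned.
        return len(self.items)

    # Optional label shown next to tracked values in the dashboard.
    track_item_count.unit = "items"
```

Keeping allocation of the measured data in `setup` means the peakmem_ method mostly reflects the work done inside the benchmark body rather than fixture construction.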
Running ASV#
```bash
# Benchmark the current commit
make asv-run

# Dry-run (quick smoke test, no results saved)
make asv-quick

# Benchmark a range of commits
uv run asv run v0.1.0..HEAD

# Compare two revisions
make asv-compare REF1=main REF2=HEAD

# Find the commit that introduced a regression
uv run asv find v0.1.0..HEAD TimePipelineIteration.time_iterate_all

# Show results for a commit
uv run asv show HEAD
```
Live Dashboard#
The ASV benchmark dashboard is published automatically to GitHub Pages by the nightly CI workflow. View it at:
The dashboard updates each night with the latest results. You can also trigger
a run manually from the Actions → Benchmark tab using workflow_dispatch.
To preview locally after running benchmarks:
```bash
make asv-publish  # Build static HTML from .asv/results
make asv-preview  # Serve at http://localhost:8080
```
Configuration#
ASV is configured in asv.conf.json at the project root. Key settings:
- environment_type: existing (uses the current Python environment; run make develop before benchmarking)
- benchmark_dir: Points to benchmarks/
- regressions_thresholds: 5 % regression triggers a warning
- All ASV artifacts (envs, results, HTML) are stored under .asv/ and gitignored
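As a rough illustration, the settings listed above could map onto asv.conf.json keys as in the following sketch. It is not the project's actual file: the `.asv/env` and `.asv/html` paths and the catch-all `.*` pattern are assumptions, and a real configuration also needs project and repository fields that are omitted here.

```python
import json

# Fragment of an ASV configuration mirroring the settings described above.
config_fragment = {
    "version": 1,
    # Run in the current environment instead of building isolated ones.
    "environment_type": "existing",
    "benchmark_dir": "benchmarks",
    # Maps a benchmark-name regex to a relative threshold (0.05 = 5 %).
    "regressions_thresholds": {".*": 0.05},
    # Keep every ASV artifact under .asv/ (directory names partly assumed).
    "env_dir": ".asv/env",
    "results_dir": ".asv/results",
    "html_dir": ".asv/html",
}

rendered = json.dumps(config_fragment, indent=2)
print(rendered)
```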
Criterion (Rust Benchmarks)#
Rust micro-benchmarks use Criterion.rs
and live in src/rust/benches/:
```bash
# Run Rust benchmarks
cargo bench --manifest-path src/rust/Cargo.toml
```
Criterion produces HTML reports in src/rust/target/criterion/.
Make Targets Reference#
| Target | Description |
|---|---|
| make bench | pytest-benchmark + Criterion (fast, current code) |
| make asv-run | ASV benchmark on HEAD (saves results) |
| make asv-quick | ASV dry-run (no results saved) |
| make asv-publish | Build ASV HTML dashboard |
| make asv-preview | Serve ASV dashboard locally |
| make asv-compare | Compare two git revisions |