Benchmarks

Performance benchmarks for ALCHEMI Toolkit-Ops kernels. The results are currently static and cached; we intend to move gradually to CI-generated results that cover different NVIDIA architectures, benchmark systems, and so on.

Available Benchmarks

About These Benchmarks

These benchmarks are intended to indicate nvalchemiops performance under a specific set of conditions. Actual performance may differ depending on a number of factors, including but not limited to structure/system topology, GPU architecture, and driver and firmware versions.

Benchmark Methodology

All benchmarks follow these principles:

  • Tensor allocation excluded: Only the relevant kernel execution time is measured; neighbor lists and preprocessing are excluded unless they are part of the benchmark.

  • Warm-up runs: Multiple warm-up iterations are run so that kernel compilation overhead is excluded and noise from cache effects is minimized.

  • Statistical sampling: Multiple timing runs are performed, and the median time, maximum memory utilization, and throughput are aggregated for reporting.

  • Error handling: Out-of-memory (OOM) results are included.

  • Consistent inputs: The same cutoff, lattice type, and parameters are used across runs.
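
The principles above can be illustrated with a minimal timing harness. This is a sketch under stated assumptions, not the harness used to produce the published results: the `benchmark` function, its `warmup`, `runs`, and `sync` parameters, and the returned dictionary layout are all hypothetical names introduced here. It uses only the Python standard library, and the built-in `MemoryError` stands in for whatever framework-specific out-of-memory exception a GPU run would raise. Memory utilization and throughput are not measured in this sketch.

```python
import statistics
import time
from typing import Callable, Optional


def benchmark(
    fn: Callable[[], object],
    warmup: int = 3,
    runs: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> dict:
    """Time fn() after warm-up and report the median over several runs.

    `sync` is a hypothetical hook: for asynchronous GPU work it would be a
    device-synchronization call so that the timer sees completed kernels.
    """
    wait = sync if sync is not None else (lambda: None)
    try:
        # Warm-up: absorbs one-time costs such as kernel compilation.
        for _ in range(warmup):
            fn()
        wait()

        samples = []
        for _ in range(runs):
            wait()  # drain pending work before starting the clock
            start = time.perf_counter()
            fn()
            wait()  # ensure asynchronous work has finished
            samples.append(time.perf_counter() - start)
    except MemoryError:
        # Assumption: MemoryError stands in for a framework's OOM exception.
        # The failure is recorded instead of dropping the configuration.
        return {"status": "oom", "median_s": None, "runs": 0}

    return {
        "status": "ok",
        "median_s": statistics.median(samples),
        "runs": len(samples),
    }


# Inputs are built once, outside the timed region, and reused for every run.
data = list(range(10_000))
result = benchmark(lambda: sum(data))
```

Reporting the median rather than the mean keeps a single slow outlier run from skewing the figure, and building `data` before calling `benchmark` keeps allocation out of the measured time.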