Benchmarks#
Performance benchmarks for ALCHEMI Toolkit-Ops kernels. The results are currently static and cached; we intend to move gradually to CI-generated results that cover different NVIDIA architectures, benchmark systems, and so on.
Available Benchmarks#
About These Benchmarks#
Benchmarks are intended to be indicative of nvalchemiops performance under
a specific set of criteria; actual performance may differ depending
on a number of factors, including but not limited to structure/system
topology, GPU architecture, and driver and firmware versions.
Benchmark Methodology#
All benchmarks follow these principles:
Tensor allocation excluded: Only the execution time of the relevant kernels is measured. Tensor allocation is excluded, as are neighbor-list construction and preprocessing when they are not part of the benchmark.
Warm-up runs: Multiple warm-up iterations are run first so that kernel compilation overhead is excluded from the timings and noise from cache effects is minimized.
Statistical sampling: Multiple timing runs are performed, and the median time, maximum memory utilization, and throughput are aggregated for reporting.
Error handling: Runs that fail with an out-of-memory (OOM) error are included in the results.
Consistent inputs: The same cutoff, lattice type, and parameters are used across runs.
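The principles above can be sketched as a small timing harness. This is an illustrative sketch only, not the harness used to produce the published results: the names `benchmark`, `run_case`, `sync`, and `items_per_call` are hypothetical, `MemoryError` stands in for whatever OOM exception the GPU framework raises, and peak-memory tracking is left out because it depends on the framework in use.

```python
import statistics
import time


def benchmark(fn, *, warmup=3, repeats=10, items_per_call=1, sync=None):
    """Time fn() after warm-up calls and report the median over `repeats` runs.

    `sync` is an optional, hypothetical callable (for example a device
    synchronize) that blocks until asynchronous GPU work has finished.
    """
    # Warm-up calls absorb one-time costs such as kernel compilation
    # and cold caches; their timings are discarded.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # make sure queued device work is counted in this sample
        timings.append(time.perf_counter() - start)

    median_s = statistics.median(timings)
    throughput = items_per_call / median_s if median_s > 0 else float("inf")
    return {"status": "ok", "median_s": median_s, "throughput": throughput}


def run_case(fn, **kwargs):
    """Run one benchmark case, recording an OOM failure instead of dropping it."""
    try:
        return benchmark(fn, **kwargs)
    except MemoryError:  # assumption: stand-in for the framework's OOM error
        return {"status": "oom", "median_s": None, "throughput": None}
```

Inputs should be allocated before calling `run_case`, so that allocation time stays outside the timed region, and the same inputs should be reused for every case being compared.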