Introduction to ALCHEMI Toolkit-Ops#
ALCHEMI Toolkit-Ops provides GPU-accelerated algorithms for atomistic simulations, computational chemistry, and graph neural networks. Built on NVIDIA Warp, it delivers high-performance primitives for neighbor list construction, dispersion corrections, and related operations on both single systems and batched datasets.
Note
If you need a quick way to get started, see the User Guide for installation and a working code example.
When to Use ALCHEMI Toolkit-Ops#
This package is designed for GPU-accelerated workflows in computational chemistry and machine learning. Common use cases include:
Density Functional Theory : Add DFT-D3(BJ) dispersion corrections for improved accuracy in weakly bound systems. Batch calculations enable high-throughput processing of molecular databases, and accurate forces support geometry optimization.
Graph Neural Networks : Generate edge connectivity with neighbor lists in COO format. Batch processing supports high-throughput training on molecular datasets with heterogeneous system sizes.
Molecular Dynamics : Construct neighbor lists for short-range interactions and dispersion forces. Skin distance optimization and rebuild detection reduce expensive list reconstructions during long simulations.
High-Throughput Screening : Process large molecular databases with batch computation. Evaluate energies and forces for conformer analysis, virtual screening, and property prediction.
Method Development : Access low-level Warp kernels for custom algorithm development with profiling and memory estimation utilities.
Design Principles#
ALCHEMI Toolkit-Ops prioritizes performance, correctness, and usability:
All algorithms are GPU-accelerated via NVIDIA Warp kernels
High-level APIs handle algorithm selection automatically; low-level kernels remain accessible for custom workflows
Outputs are PyTorch tensors with full
torch.compilecompatibilityBoth dense (neighbor matrix) and sparse (COO) formats are supported
Core Components#
Neighbor Finding#
The neighbor_list() function provides a unified
interface that automatically selects between algorithms based on system size
and whether batch indices are provided. It returns either a dense neighbor
matrix or sparse COO list, with consistent behavior across all modes.
Tip
The crossover point between cell list and naive algorithms depends on system density and cutoff radius. We encourage users to benchmark the performance on their workload and develop an intuition for which algorithm to use under what circumstances.
Choose the right parameters based on your use case:
Single system with >5000 atoms
from nvalchemiops.neighborlist import neighbor_list
neighbor_matrix, num_neighbors, shifts = neighbor_list(
positions, cutoff, cell=cell, pbc=pbc, method="cell_list"
)
Dispatches to cell_list() — O(N) algorithm
using spatial decomposition.
Single system with <5000 atoms
from nvalchemiops.neighborlist import neighbor_list
neighbor_matrix, num_neighbors, shifts = neighbor_list(
positions, cutoff, cell=cell, pbc=pbc, method="naive"
)
Dispatches to naive_neighbor_list() — O(N²)
algorithm with lower overhead.
Multiple systems with >5000 atoms each
from nvalchemiops.neighborlist import neighbor_list
neighbor_matrix, num_neighbors, shifts = neighbor_list(
positions, cutoff, cell=cells, pbc=pbc,
batch_idx=batch_idx, method="batch_cell_list"
)
Dispatches to batch_cell_list() — O(N)
algorithm for heterogeneous batches.
Multiple systems with <5000 atoms each
from nvalchemiops.neighborlist import neighbor_list
neighbor_matrix, num_neighbors, shifts = neighbor_list(
positions, cutoff, cell=cells, pbc=pbc,
batch_idx=batch_idx, method="batch_naive"
)
Dispatches to batch_naive_neighbor_list() —
O(N²) algorithm for batched small systems.
Note
When method is not specified, neighbor_list automatically selects based on
system size (≥5000 atoms → cell list) and whether batch_idx is provided.
For advanced workflows, build_cell_list() and
query_cell_list() allow caching spatial data
structures between queries. Dual-cutoff variants are available for multi-range
potentials.
Rebuild Detection#
For molecular dynamics workflows,
cell_list_needs_rebuild() and related functions
minimize expensive cell list reconstructions by detecting when atoms have moved
beyond a skin distance threshold.
Dispersion Corrections#
The dftd3()
function computes DFT-D3(BJ) dispersion energy and forces with
environment-dependent C6 coefficients based on coordination numbers. It supports
both neighbor formats, periodic and non-periodic systems, and batched computation.
Tip
The damping parameters (a1, a2, s8) are functional-dependent. Common values
can be found in the Grimme group DFT-D3 documentation.
Positions and parameters must use consistent units (atomic units recommended).
Choose the right parameters based on your system type:
Non-periodic molecular system
from nvalchemiops.interactions.dispersion.dftd3 import dftd3
from nvalchemiops.neighborlist import neighbor_list
# Build neighbor list (positions in Bohr)
neighbors, neighbor_ptr, _ = neighbor_list(
positions, cutoff=40.0, return_neighbor_list=True
)
# Compute D3 correction (PBE functional)
energy, forces, coord_num = dftd3(
positions, numbers, neighbor_list=neighbors,
a1=0.4289, a2=4.4407, s8=0.7875, d3_params=d3_params
)
Single periodic system (crystal, surface)
from nvalchemiops.interactions.dispersion.dftd3 import dftd3
from nvalchemiops.neighborlist import neighbor_list
# Build neighbor list with PBC
neighbors, neighbor_ptr, shifts = neighbor_list(
positions, cutoff=40.0, cell=cell, pbc=pbc, return_neighbor_list=True
)
# Compute D3 correction with periodic shifts
energy, forces, coord_num = dftd3(
positions, numbers, neighbor_list=neighbors,
a1=0.4289, a2=4.4407, s8=0.7875, d3_params=d3_params,
cell=cell, unit_shifts=shifts
)
Multiple systems processed simultaneously
from nvalchemiops.interactions.dispersion.dftd3 import dftd3
from nvalchemiops.neighborlist import neighbor_list
# Build batched neighbor list
neighbors, neighbor_ptr, shifts = neighbor_list(
positions, cutoff=40.0, cell=cells, pbc=pbc,
batch_idx=batch_idx, return_neighbor_list=True
)
# Compute D3 correction for all systems
energy, forces, coord_num = dftd3(
positions, numbers, neighbor_list=neighbors,
a1=0.4289, a2=4.4407, s8=0.7875, d3_params=d3_params,
cell=cells, unit_shifts=shifts, batch_idx=batch_idx
)
Returns per-system energies with shape (num_systems,).
Note
DFT-D3 parameters must be provided via the d3_params argument (as a
D3Parameters instance or
dict). See the dispersion documentation for parameter
setup and loading from standard reference files.
Electrostatic Interactions#
The ewald_summation() and
particle_mesh_ewald() functions
compute long-range Coulomb interactions in periodic systems. Both methods split the
slowly-converging \(1/r\) potential into real-space (short-range) and reciprocal-space
(long-range) components, with automatic parameter estimation based on target accuracy.
Tip
For systems with <5000 atoms, use Ewald summation. For larger systems, PME provides \(O(N \log N)\) scaling via FFT-accelerated reciprocal-space calculations.
Choose the right method based on your system size:
Small to medium systems (<5000 atoms)
from nvalchemiops.interactions.electrostatics import ewald_summation
from nvalchemiops.neighborlist import neighbor_list
# Build neighbor list
neighbors, _, shifts = neighbor_list(
positions, cutoff=10.0, cell=cell, pbc=pbc, return_neighbor_list=True
)
# Compute electrostatics (parameters estimated automatically)
energies, forces = ewald_summation(
positions, charges, cell, neighbor_list=neighbors,
neighbor_shifts=shifts, accuracy=1e-6
)
Uses explicit k-vector summation in reciprocal space — \(O(N^2)\) scaling.
Large systems (>5000 atoms)
from nvalchemiops.interactions.electrostatics import particle_mesh_ewald
from nvalchemiops.neighborlist import neighbor_list
# Build neighbor list
neighbors, _, shifts = neighbor_list(
positions, cutoff=10.0, cell=cell, pbc=pbc, return_neighbor_list=True
)
# Compute electrostatics with FFT acceleration
energies, forces = particle_mesh_ewald(
positions, charges, cell, neighbor_list=neighbors,
neighbor_shifts=shifts, accuracy=1e-6
)
Uses FFT-based reciprocal-space calculation — \(O(N \log N)\) scaling.
Multiple systems processed simultaneously
from nvalchemiops.interactions.electrostatics import ewald_summation
from nvalchemiops.neighborlist import neighbor_list
# Build batched neighbor list
neighbors, _, shifts = neighbor_list(
positions, cutoff=10.0, cell=cells, pbc=pbc,
batch_idx=batch_idx, return_neighbor_list=True
)
# Batched electrostatics
energies, forces = ewald_summation(
positions, charges, cell=cells, neighbor_list=neighbors,
neighbor_shifts=shifts, batch_idx=batch_idx, accuracy=1e-6
)
Returns per-atom energies; sum by system using batch_idx.
Ecosystem Integration#
ALCHEMI Toolkit-Ops integrates with the scientific Python ecosystem:
PyTorch#
All inputs and outputs are PyTorch tensors with automatic CPU/GPU handling.
Custom operators are registered for torch.compile compatibility, and tensors
maintain gradients for automatic differentiation where applicable.
NVIDIA Tools#
Kernels are implemented in NVIDIA Warp with GPU memory layouts optimized for NVIDIA architectures. Standard profiling tools like Nsight Systems work out of the box.
Computational Chemistry#
DFT-D3 parameters follow the standard format from the Grimme group, ensuring compatibility with established workflows.
What’s Next?#
Follow the installation guide to set up your environment
Read the neighbor list documentation for spatial algorithm details
Explore DFT-D3 dispersion corrections for energy and force calculations
Learn about electrostatic interactions for Ewald summation and PME calculations
Check the
examples/directory for complete working scripts