Introduction to ALCHEMI Toolkit-Ops#

ALCHEMI Toolkit-Ops provides GPU-accelerated algorithms for atomistic simulations, computational chemistry, and graph neural networks. Built on NVIDIA Warp, it delivers high-performance primitives for neighbor list construction, dispersion corrections, and related operations on both single systems and batched datasets.

Note

If you need a quick way to get started, see the User Guide for installation and a working code example.

When to Use ALCHEMI Toolkit-Ops#

This package is designed for GPU-accelerated workflows in computational chemistry and machine learning. Common use cases include:

Density Functional Theory : Add DFT-D3(BJ) dispersion corrections for improved accuracy in weakly bound systems. Batch calculations enable high-throughput processing of molecular databases, and accurate forces support geometry optimization.

Graph Neural Networks : Generate edge connectivity with neighbor lists in COO format. Batch processing supports high-throughput training on molecular datasets with heterogeneous system sizes.

Molecular Dynamics : Construct neighbor lists for short-range interactions and dispersion forces. Skin distance optimization and rebuild detection reduce expensive list reconstructions during long simulations.

High-Throughput Screening : Process large molecular databases with batch computation. Evaluate energies and forces for conformer analysis, virtual screening, and property prediction.

Method Development : Access low-level Warp kernels for custom algorithm development with profiling and memory estimation utilities.

Design Principles#

ALCHEMI Toolkit-Ops prioritizes performance, correctness, and usability:

  1. All algorithms are GPU-accelerated via NVIDIA Warp kernels

  2. High-level APIs handle algorithm selection automatically; low-level kernels remain accessible for custom workflows

  3. Outputs are PyTorch tensors with full torch.compile compatibility

  4. Both dense (neighbor matrix) and sparse (COO) formats are supported

Core Components#

Neighbor Finding#

The neighbor_list() function provides a unified interface that automatically selects between algorithms based on system size and whether batch indices are provided. It returns either a dense neighbor matrix or sparse COO list, with consistent behavior across all modes.

Tip

The crossover point between cell list and naive algorithms depends on system density and cutoff radius. We encourage users to benchmark the performance on their workload and develop an intuition for which algorithm to use under what circumstances.

Choose the right parameters based on your use case:

Single system with >5000 atoms

from nvalchemiops.neighborlist import neighbor_list

neighbor_matrix, num_neighbors, shifts = neighbor_list(
    positions, cutoff, cell=cell, pbc=pbc, method="cell_list"
)

Dispatches to cell_list() — O(N) algorithm using spatial decomposition.

Single system with <5000 atoms

from nvalchemiops.neighborlist import neighbor_list

neighbor_matrix, num_neighbors, shifts = neighbor_list(
    positions, cutoff, cell=cell, pbc=pbc, method="naive"
)

Dispatches to naive_neighbor_list() — O(N²) algorithm with lower overhead.

Multiple systems with >5000 atoms each

from nvalchemiops.neighborlist import neighbor_list

neighbor_matrix, num_neighbors, shifts = neighbor_list(
    positions, cutoff, cell=cells, pbc=pbc,
    batch_idx=batch_idx, method="batch_cell_list"
)

Dispatches to batch_cell_list() — O(N) algorithm for heterogeneous batches.

Multiple systems with <5000 atoms each

from nvalchemiops.neighborlist import neighbor_list

neighbor_matrix, num_neighbors, shifts = neighbor_list(
    positions, cutoff, cell=cells, pbc=pbc,
    batch_idx=batch_idx, method="batch_naive"
)

Dispatches to batch_naive_neighbor_list() — O(N²) algorithm for batched small systems.

Note

When method is not specified, neighbor_list automatically selects based on system size (≥5000 atoms → cell list) and whether batch_idx is provided.

For advanced workflows, build_cell_list() and query_cell_list() allow caching spatial data structures between queries. Dual-cutoff variants are available for multi-range potentials.

Rebuild Detection#

For molecular dynamics workflows, cell_list_needs_rebuild() and related functions minimize expensive cell list reconstructions by detecting when atoms have moved beyond a skin distance threshold.

Dispersion Corrections#

The dftd3() function computes DFT-D3(BJ) dispersion energy and forces with environment-dependent C6 coefficients based on coordination numbers. It supports both neighbor formats, periodic and non-periodic systems, and batched computation.

Tip

The damping parameters (a1, a2, s8) are functional-dependent. Common values can be found in the Grimme group DFT-D3 documentation. Positions and parameters must use consistent units (atomic units recommended).

Choose the right parameters based on your system type:

Non-periodic molecular system

from nvalchemiops.interactions.dispersion.dftd3 import dftd3
from nvalchemiops.neighborlist import neighbor_list

# Build neighbor list (positions in Bohr)
neighbors, neighbor_ptr, _ = neighbor_list(
    positions, cutoff=40.0, return_neighbor_list=True
)

# Compute D3 correction (PBE functional)
energy, forces, coord_num = dftd3(
    positions, numbers, neighbor_list=neighbors,
    a1=0.4289, a2=4.4407, s8=0.7875, d3_params=d3_params
)

Single periodic system (crystal, surface)

from nvalchemiops.interactions.dispersion.dftd3 import dftd3
from nvalchemiops.neighborlist import neighbor_list

# Build neighbor list with PBC
neighbors, neighbor_ptr, shifts = neighbor_list(
    positions, cutoff=40.0, cell=cell, pbc=pbc, return_neighbor_list=True
)

# Compute D3 correction with periodic shifts
energy, forces, coord_num = dftd3(
    positions, numbers, neighbor_list=neighbors,
    a1=0.4289, a2=4.4407, s8=0.7875, d3_params=d3_params,
    cell=cell, unit_shifts=shifts
)

Multiple systems processed simultaneously

from nvalchemiops.interactions.dispersion.dftd3 import dftd3
from nvalchemiops.neighborlist import neighbor_list

# Build batched neighbor list
neighbors, neighbor_ptr, shifts = neighbor_list(
    positions, cutoff=40.0, cell=cells, pbc=pbc,
    batch_idx=batch_idx, return_neighbor_list=True
)

# Compute D3 correction for all systems
energy, forces, coord_num = dftd3(
    positions, numbers, neighbor_list=neighbors,
    a1=0.4289, a2=4.4407, s8=0.7875, d3_params=d3_params,
    cell=cells, unit_shifts=shifts, batch_idx=batch_idx
)

Returns per-system energies with shape (num_systems,).

Note

DFT-D3 parameters must be provided via the d3_params argument (as a D3Parameters instance or dict). See the dispersion documentation for parameter setup and loading from standard reference files.

Electrostatic Interactions#

The ewald_summation() and particle_mesh_ewald() functions compute long-range Coulomb interactions in periodic systems. Both methods split the slowly-converging \(1/r\) potential into real-space (short-range) and reciprocal-space (long-range) components, with automatic parameter estimation based on target accuracy.

Tip

For systems with <5000 atoms, use Ewald summation. For larger systems, PME provides \(O(N \log N)\) scaling via FFT-accelerated reciprocal-space calculations.

Choose the right method based on your system size:

Small to medium systems (<5000 atoms)

from nvalchemiops.interactions.electrostatics import ewald_summation
from nvalchemiops.neighborlist import neighbor_list

# Build neighbor list
neighbors, _, shifts = neighbor_list(
    positions, cutoff=10.0, cell=cell, pbc=pbc, return_neighbor_list=True
)

# Compute electrostatics (parameters estimated automatically)
energies, forces = ewald_summation(
    positions, charges, cell, neighbor_list=neighbors,
    neighbor_shifts=shifts, accuracy=1e-6
)

Uses explicit k-vector summation in reciprocal space — \(O(N^2)\) scaling.

Large systems (>5000 atoms)

from nvalchemiops.interactions.electrostatics import particle_mesh_ewald
from nvalchemiops.neighborlist import neighbor_list

# Build neighbor list
neighbors, _, shifts = neighbor_list(
    positions, cutoff=10.0, cell=cell, pbc=pbc, return_neighbor_list=True
)

# Compute electrostatics with FFT acceleration
energies, forces = particle_mesh_ewald(
    positions, charges, cell, neighbor_list=neighbors,
    neighbor_shifts=shifts, accuracy=1e-6
)

Uses FFT-based reciprocal-space calculation — \(O(N \log N)\) scaling.

Multiple systems processed simultaneously

from nvalchemiops.interactions.electrostatics import ewald_summation
from nvalchemiops.neighborlist import neighbor_list

# Build batched neighbor list
neighbors, _, shifts = neighbor_list(
    positions, cutoff=10.0, cell=cells, pbc=pbc,
    batch_idx=batch_idx, return_neighbor_list=True
)

# Batched electrostatics
energies, forces = ewald_summation(
    positions, charges, cell=cells, neighbor_list=neighbors,
    neighbor_shifts=shifts, batch_idx=batch_idx, accuracy=1e-6
)

Returns per-atom energies; sum by system using batch_idx.

Ecosystem Integration#

ALCHEMI Toolkit-Ops integrates with the scientific Python ecosystem:

PyTorch#

All inputs and outputs are PyTorch tensors with automatic CPU/GPU handling. Custom operators are registered for torch.compile compatibility, and tensors maintain gradients for automatic differentiation where applicable.

NVIDIA Tools#

Kernels are implemented in NVIDIA Warp with GPU memory layouts optimized for NVIDIA architectures. Standard profiling tools like Nsight Systems work out of the box.

Computational Chemistry#

DFT-D3 parameters follow the standard format from the Grimme group, ensuring compatibility with established workflows.

What’s Next?#

  1. Follow the installation guide to set up your environment

  2. Read the neighbor list documentation for spatial algorithm details

  3. Explore DFT-D3 dispersion corrections for energy and force calculations

  4. Learn about electrostatic interactions for Ewald summation and PME calculations

  5. Check the examples/ directory for complete working scripts