nvalchemi.dynamics.hooks.NaNDetectorHook

class nvalchemi.dynamics.hooks.NaNDetectorHook(frequency=1, extra_keys=None)

Detect NaN or Inf values in model outputs and raise immediately.

After each model forward pass, this hook inspects batch.forces and batch.energies for non-finite values (NaN or Inf). If any are found, it raises a RuntimeError with diagnostic information including:

  • Which field(s) contain non-finite values (forces, energies, or both).

  • The graph indices of affected samples (via batch.batch).

  • The current dynamics.step_count.

  • The number of non-finite elements.

This early detection prevents corrupted state from propagating through the integrator, which would produce meaningless trajectories and waste compute. It is especially useful when running ML potentials on geometries outside their training distribution, where force predictions can diverge without warning.
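The check described above can be sketched as follows. This is a minimal illustration, not the hook's actual implementation. It assumes that per-atom fields share their first dimension with the atom-to-graph index (the role played by batch.batch) and that per-graph fields such as energies have one row per graph. The helper names find_nonfinite and check_or_raise are hypothetical.

```python
import torch


def find_nonfinite(fields, graph_index):
    """Map each offending field name to (n_bad_elements, affected_graph_ids).

    fields: dict of name -> tensor (hypothetical stand-in for batch attributes).
    graph_index: LongTensor [n_atoms], atom -> graph (the role of batch.batch).
    """
    report = {}
    for name, tensor in fields.items():
        bad = ~torch.isfinite(tensor)  # True where the value is NaN or +/-Inf
        if not bad.any():
            continue
        # Collapse trailing dimensions so each row (atom or graph) is flagged once.
        bad_rows = bad.reshape(bad.shape[0], -1).any(dim=1)
        if bad_rows.shape[0] == graph_index.shape[0]:
            # Per-atom field: translate atom rows into graph indices.
            graphs = graph_index[bad_rows].unique()
        else:
            # Assumed per-graph field: the row index is the graph index.
            graphs = bad_rows.nonzero().flatten()
        report[name] = (int(bad.sum()), graphs.tolist())
    return report


def check_or_raise(fields, graph_index, step_count):
    """Raise RuntimeError with diagnostics if any field is non-finite."""
    report = find_nonfinite(fields, graph_index)
    if report:
        details = "; ".join(
            f"{name}: {count} non-finite element(s) in graph(s) {graphs}"
            for name, (count, graphs) in report.items()
        )
        raise RuntimeError(f"Non-finite model output at step {step_count}: {details}")


# Two graphs with two atoms each; atom 1 (graph 0) has a NaN and an Inf force component.
forces = torch.tensor(
    [[0.0, 0.0, 0.0], [float("nan"), 1.0, float("inf")], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
)
energies = torch.tensor([1.0, float("inf")])  # the energy of graph 1 is Inf
graph_index = torch.tensor([0, 0, 1, 1])

report = find_nonfinite({"forces": forces, "energies": energies}, graph_index)
# report == {"forces": (2, [0]), "energies": (1, [1])}
```

Reporting graph indices rather than atom indices shows which sample in the batch diverged, which is usually the first thing needed when debugging a failed run.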

The hook can optionally check additional tensor keys beyond forces and energies by specifying extra_keys.

Parameters:
  • frequency (int, optional) – Check every frequency steps. Default 1 (every step). Setting this higher reduces overhead at the cost of delayed detection.

  • extra_keys (list[str] | None, optional) – Additional batch attribute names to check for non-finite values (e.g. ["stresses", "velocities"]). Each key must be a tensor attribute on Batch. Default None (check only forces and energies).
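How the two parameters might be applied can be sketched as below. The gating rule (step_count % frequency == 0) and the helpers should_check and collect_fields are assumptions made for illustration, and a SimpleNamespace stands in for Batch. The hook's real logic may differ.

```python
from types import SimpleNamespace

DEFAULT_KEYS = ("forces", "energies")


def should_check(step_count, frequency=1):
    # Assumed gating rule: run the check only on steps divisible by `frequency`.
    return step_count % frequency == 0


def collect_fields(batch, extra_keys=None):
    # The default keys are always checked; extra_keys are appended after them.
    keys = list(DEFAULT_KEYS) + list(extra_keys or [])
    # getattr raises AttributeError if a requested key is missing from the batch.
    return {key: getattr(batch, key) for key in keys}


# Hypothetical stand-in for a Batch; plain lists replace tensors for brevity.
batch = SimpleNamespace(forces=[0.0], energies=[0.0], stresses=[0.0])

checked_steps = [step for step in range(1, 11) if should_check(step, frequency=5)]
fields = collect_fields(batch, extra_keys=["stresses"])
```

Under this assumed rule, frequency=5 checks steps 5 and 10 only, so a value that becomes non-finite at step 6 would not be reported until step 10.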

extra_keys#

Additional keys to check beyond forces and energies.

Type:

list[str]

frequency#

Check frequency in steps.

Type:

int

stage#

Fixed to AFTER_COMPUTE.

Type:

HookStageEnum

Examples

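>>> from nvalchemi.dynamics.hooks import NaNDetectorHook
>>> # DemoDynamics, model, and batch are assumed to be defined elsewhere.
>>> hook = NaNDetectorHook()  # check every step
>>> dynamics = DemoDynamics(model=model, n_steps=1000, dt=0.5, hooks=[hook])
>>> dynamics.run(batch)  # raises RuntimeError if forces or energies become non-finite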

Check additional fields:

>>> hook = NaNDetectorHook(extra_keys=["stresses", "velocities"])
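Because the hook raises RuntimeError, a driver script can catch the failure and record the diagnostic message before deciding how to proceed. The sketch below is a generic pattern. run_guarded and failing_run are hypothetical names, and failing_run merely imitates a run that the hook aborted.

```python
def run_guarded(run, *args, **kwargs):
    """Call `run`; return the error message if it raised RuntimeError, else None."""
    try:
        run(*args, **kwargs)
    except RuntimeError as err:
        # RuntimeError is a broad exception type, so keep the full message
        # to tell a non-finite report apart from unrelated failures.
        return str(err)
    return None


def failing_run():
    # Hypothetical stand-in for dynamics.run(batch) aborted by the hook.
    raise RuntimeError("Non-finite model output at step 42")


message = run_guarded(failing_run)
if message is not None:
    print(f"Run aborted: {message}")
```

In a real script, `run` would be something like dynamics.run with batch as its argument.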

Notes

  • The check uses torch.isfinite on the full concatenated tensors, so its overhead scales with the total atom count rather than with the number of graphs in the batch.

  • For production runs where overhead is a concern, set frequency=10 or frequency=100 to amortize the cost.

  • Consider pairing with MaxForceClampHook as a first line of defense: clamping prevents many NaN-producing integration failures.
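To illustrate why clamping and NaN detection complement each other, here is a sketch of per-atom force-norm clamping. It is not MaxForceClampHook's implementation: the function clamp_force_norm, its max_norm parameter, and the choice to clamp by vector norm rather than per component are all assumptions.

```python
import torch


def clamp_force_norm(forces, max_norm):
    """Rescale each per-atom force vector so its norm does not exceed max_norm."""
    norms = forces.norm(dim=-1, keepdim=True)
    # Guard against division by zero for atoms that feel no force.
    scale = torch.clamp(max_norm / norms.clamp(min=1e-12), max=1.0)
    return forces * scale


forces = torch.tensor([[3.0, 4.0, 0.0], [0.1, 0.0, 0.0]])
clamped = clamp_force_norm(forces, max_norm=1.0)
# The first row (norm 5) is scaled to norm 1; the second row (norm 0.1) is unchanged.

# A NaN that is already present survives clamping.
still_bad = clamp_force_norm(torch.tensor([[float("nan"), 0.0, 0.0]]), max_norm=1.0)
```

Clamping limits large but finite forces before they destabilize the integrator. It does not repair values that are already NaN, so the detector is still needed.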

__init__(frequency=1, extra_keys=None)
Parameters:
  • frequency (int)

  • extra_keys (list[str] | None)

Return type:

None

Methods

__init__([frequency, extra_keys])

Attributes

stage