nvalchemi.dynamics.hooks.NaNDetectorHook
- class nvalchemi.dynamics.hooks.NaNDetectorHook(frequency=1, extra_keys=None)
Detect NaN or Inf values in model outputs and raise immediately.
After each model forward pass, this hook inspects batch.forces and batch.energies for non-finite values (NaN or Inf). If any are found, it raises a RuntimeError with diagnostic information including:
- Which field(s) contain non-finite values (forces, energies, or both).
- The graph indices of affected samples (via batch.batch).
- The current dynamics.step_count.
- The number of non-finite elements.
This early detection prevents corrupted state from propagating through the integrator, which would produce meaningless trajectories and waste compute. It is especially useful when running ML potentials on geometries outside their training distribution, where force predictions can diverge without warning.
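The check described above can be sketched in plain PyTorch. This is a minimal illustration, not the hook's actual implementation: the helper names find_non_finite and check_outputs are hypothetical, forces is assumed to be a per-atom tensor of shape (n_atoms, 3), energies a per-graph tensor, and graph_index stands in for batch.batch.

```python
import torch


def find_non_finite(name, tensor, graph_index=None):
    """Return (name, affected_graphs, n_bad), or None if all values are finite.

    Hypothetical helper; `graph_index` plays the role of `batch.batch`.
    """
    bad = ~torch.isfinite(tensor)
    n_bad = int(bad.sum())
    if n_bad == 0:
        return None
    # Collapse trailing dimensions so each row (atom or graph) is flagged once.
    bad_rows = bad.reshape(bad.shape[0], -1).any(dim=1)
    if graph_index is not None:
        # Per-atom tensor: map flagged atoms to the graphs they belong to.
        graphs = torch.unique(graph_index[bad_rows]).tolist()
    else:
        # Per-graph tensor: the row index is the graph index.
        graphs = torch.nonzero(bad_rows).flatten().tolist()
    return name, graphs, n_bad


def check_outputs(forces, energies, graph_index, step_count):
    """Raise RuntimeError if forces or energies contain NaN or Inf."""
    reports = [
        report
        for report in (
            find_non_finite("forces", forces, graph_index),
            find_non_finite("energies", energies),
        )
        if report is not None
    ]
    if reports:
        details = "; ".join(
            f"{name}: {n_bad} non-finite element(s) in graph(s) {graphs}"
            for name, graphs, n_bad in reports
        )
        raise RuntimeError(
            f"Non-finite model output at step {step_count}: {details}"
        )


# Two graphs: atoms 0-1 belong to graph 0, atoms 2-4 to graph 1.
graph_index = torch.tensor([0, 0, 1, 1, 1])
forces = torch.zeros(5, 3)
forces[3, 1] = float("nan")
energies = torch.tensor([-1.0, float("inf")])

try:
    check_outputs(forces, energies, graph_index, step_count=42)
except RuntimeError as err:
    message = str(err)
    print(message)
```

Raising on the first failing step, rather than logging and continuing, keeps the reported step count and graph indices tied to the state that produced the non-finite values.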
The hook can optionally check additional tensor keys beyond forces and energies by specifying extra_keys.
- Parameters:
frequency (int, optional) – Check every frequency steps. Default 1 (every step). Setting this higher reduces overhead at the cost of delayed detection.
extra_keys (list[str] | None, optional) – Additional batch attribute names to check for non-finite values (e.g. ["stresses", "velocities"]). Each key must be a tensor attribute on Batch. Default None (check only forces and energies).
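How the two parameters interact can be illustrated with a small, library-free sketch. The names should_check, keys_to_check and collect_fields are hypothetical, a SimpleNamespace stands in for Batch, and the rule "check when step_count is a multiple of frequency" is an assumption about how the step gating works, not something this page confirms.

```python
from types import SimpleNamespace

DEFAULT_KEYS = ("forces", "energies")


def should_check(step_count, frequency=1):
    """Assumed gating rule: run the check on every `frequency`-th step."""
    return step_count % frequency == 0


def keys_to_check(extra_keys=None):
    """Forces and energies are always checked; extra keys are appended."""
    return list(DEFAULT_KEYS) + list(extra_keys or [])


def collect_fields(batch, extra_keys=None):
    """Look up each key on the batch, failing loudly on a missing attribute."""
    fields = {}
    for key in keys_to_check(extra_keys):
        if not hasattr(batch, key):
            raise AttributeError(f"batch has no attribute {key!r}")
        fields[key] = getattr(batch, key)
    return fields


# Stand-in for a Batch object with one extra tensor-like attribute.
batch = SimpleNamespace(forces=[0.0], energies=[0.0], stresses=[0.0])

checked_steps = [s for s in range(1, 31) if should_check(s, frequency=10)]
print(checked_steps)  # [10, 20, 30]

checked_keys = sorted(collect_fields(batch, extra_keys=["stresses"]))
print(checked_keys)  # ['energies', 'forces', 'stresses']
```

Under this assumed rule, a non-finite value that first appears at step 11 with frequency=10 would only be caught at step 20, which is the delayed detection the frequency parameter trades for lower overhead.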
- extra_keys
Additional keys to check beyond forces and energies.
- Type:
list[str]
- frequency
Check frequency in steps.
- Type:
int
- stage
Fixed to AFTER_COMPUTE.
- Type:
Examples
>>> from nvalchemi.dynamics.hooks import NaNDetectorHook
>>> hook = NaNDetectorHook()  # check every step
>>> dynamics = DemoDynamics(model=model, n_steps=1000, dt=0.5, hooks=[hook])
>>> dynamics.run(batch)
Check additional fields:
>>> hook = NaNDetectorHook(extra_keys=["stresses", "velocities"])
Notes
- The check uses torch.isfinite and operates on the full concatenated tensors, so the overhead scales with total atom count rather than batch size.
- For production runs where overhead is a concern, set frequency=10 or frequency=100 to amortize the cost.
- Consider pairing with MaxForceClampHook as a first line of defense: clamping prevents many NaN-producing integration failures.
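To make the clamping idea concrete, here is a generic per-atom force-norm clamp in PyTorch. It is only a sketch of the general technique: this page does not describe how MaxForceClampHook works, and the function clamp_force_norm and its max_norm parameter are hypothetical.

```python
import torch


def clamp_force_norm(forces, max_norm):
    """Rescale each atom's force vector so its norm is at most `max_norm`.

    Generic norm clamp for illustration; not the MaxForceClampHook code.
    """
    norms = forces.norm(dim=-1, keepdim=True)
    # Scale factor is 1 for small forces and max_norm / norm for large ones.
    scale = torch.clamp(max_norm / norms.clamp_min(1e-12), max=1.0)
    return forces * scale


forces = torch.tensor([[3.0, 4.0, 0.0],   # norm 5.0, gets rescaled
                       [0.3, 0.4, 0.0]])  # norm 0.5, left unchanged
clamped = clamp_force_norm(forces, max_norm=1.0)
print(clamped)

# A force that is already NaN stays NaN after this clamp.
still_bad = clamp_force_norm(torch.tensor([[float("nan"), 0.0, 0.0]]), 1.0)
print(torch.isfinite(still_bad).all())
```

In this sketch the clamp limits large but finite forces before they destabilize the integrator, while values that are already non-finite pass through unchanged, so a finiteness check is still needed alongside it.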
- __init__(frequency=1, extra_keys=None)
- Parameters:
frequency (int)
extra_keys (list[str] | None)
- Return type:
None
Methods
__init__([frequency, extra_keys])
Attributes