nvalchemi.data.AtomicData#

class nvalchemi.data.AtomicData(*, atomic_numbers, positions, atomic_masses=None, atom_categories=None, edge_index=None, shifts=None, unit_shifts=None, cell=None, pbc=None, forces=None, energies=None, stresses=None, virials=None, dipoles=None, node_charges=None, graph_charges=None, node_attrs=None, node_alpha_spins=None, node_beta_spins=None, graph_spins=None, graph_alpha_spins=None, node_embeddings=None, edge_embeddings=None, graph_embeddings=None, velocities=None, momenta=None, kinetic_energies=None, info=<factory>, **extra_data)[source]#

Atomic data structure for molecular systems.

Represents molecular systems as graphs with atomic properties and interactions. Uses Pydantic for validation and serialization, with DataMixin for graph functionality.

Parameters:
  • atomic_numbers (Int64[Tensor, 'V']) – Atomic numbers for each node [n_nodes]

  • positions (Float[Tensor, 'V 3']) – Cartesian coordinates for each atom [n_nodes, 3]

  • atomic_masses (Float[Tensor, 'V'] | None) – Atomic masses [n_nodes]

  • atom_categories (list[AtomCategory] | Integer[Tensor, 'V'] | None) – Atom categorical index, based on the _typing.AtomCategory Enum [n_nodes]

  • edge_index (Int64[Tensor, '2 E'] | None) – Edge index [2, n_edges]

  • shifts (Float[Tensor, 'E 3'] | None) – Shifts for each edge [n_edges, 3]

  • unit_shifts (Float[Tensor, 'E 3'] | None) – Additional shifts for each edge [n_edges, 3]

  • cell (Float[Tensor, 'B 3 3'] | None) – Unit cell vectors [3, 3]

  • pbc (Bool[Tensor, 'B 3'] | None) – Boolean tensor indicating periodic boundary conditions along each dimension

  • forces (Float[Tensor, 'V 3'] | None) – Atomic forces [n_nodes, 3]

  • energies (Float[Tensor, 'B 1'] | None) – Total energies [1]

  • stresses (Float[Tensor, 'B 3 3'] | None) – Stress tensor [1, 3, 3]

  • virials (Float[Tensor, 'B 3 3'] | None) – Virial tensor [1, 3, 3]

  • dipoles (Float[Tensor, 'B 3'] | None) – Dipole moments of the system

  • node_charges (Float[Tensor, 'V 1'] | None) – Partial atomic charges [n_nodes]

  • graph_charges (Float[Tensor, 'B 1'] | None) – Total system charges [1]

  • node_attrs (Float[Tensor, 'V A'] | None) – Node attributes [n_nodes, n_node_attrs]

  • node_alpha_spins (Float[Tensor, 'V 1'] | None) – Alpha spins for each atom [n_nodes, 1]. Use this field for closed-shell spins.

  • node_beta_spins (Float[Tensor, 'V 1'] | None) – Beta spins for each atom [n_nodes, 1]. For restricted spin, use node_alpha_spins instead.

  • graph_spins (Float[Tensor, 'B 1'] | None) – Spin or multiplicity value for the system [1, 1]

  • graph_alpha_spins (Float[Tensor, 'B 1'] | None) – Alpha spins for the entire graph [1, 1]

  • node_embeddings (Float[Tensor, 'V H'] | None) – Embeddings for each node within the batch/graph

  • edge_embeddings (Float[Tensor, 'E H'] | None) – Embeddings for each edge within the batch/graph

  • graph_embeddings (Float[Tensor, 'B H'] | None) – Embeddings for the entire graph, or for each graph within a batch

  • velocities (Float[Tensor, 'V 3'] | None) – Atomic velocities [n_nodes, 3], in units set by positions

  • momenta (Float[Tensor, 'V 3'] | None) – Atomic momenta [n_nodes, 3], in units set by positions

  • kinetic_energies (Float[Tensor, 'V 1'] | None) – Per-atom kinetic energies [n_nodes, 1], with the same units as energies

  • info (dict[str, Tensor])

  • extra_data (Any)
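
The shape labels in the parameter types can be read from the accompanying descriptions: V is the number of nodes (atoms) and E is the number of edges, while B appears to be a leading per-system (batch) dimension. The sketch below lays out a single water molecule with plain Python lists standing in for tensors. The `shape` helper and all numeric values are illustrative assumptions, not part of nvalchemi.

```python
def shape(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


# One water molecule: V = 3 atoms (O, H, H).
atomic_numbers = [8, 1, 1]                      # [V]
positions = [                                   # [V, 3], illustrative coordinates
    [0.000, 0.000, 0.000],
    [0.757, 0.586, 0.000],
    [-0.757, 0.586, 0.000],
]
# Both O-H pairs listed in each direction: E = 4 edges.
edge_index = [                                  # [2, E]
    [0, 1, 0, 2],   # first endpoint of each edge
    [1, 0, 2, 0],   # second endpoint of each edge
]
energies = [[-76.4]]                            # [B, 1] with B = 1, placeholder value

print(shape(atomic_numbers), shape(positions), shape(edge_index), shape(energies))
```

Only `atomic_numbers` and `positions` are required by the signature above; every other field defaults to None.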

atomic_numbers#

Atomic numbers of each atom [n_nodes]

Type:

torch.Tensor

positions#

Cartesian coordinates [n_nodes, 3]

Type:

torch.Tensor

atomic_masses#

Atomic masses [n_nodes]

Type:

torch.Tensor

edge_index#

Edge index [2, n_edges]

Type:

torch.Tensor

node_attrs#

Node attributes [n_nodes, n_node_attrs]

Type:

torch.Tensor

shifts#

Shifts for each edge [n_edges, 3]

Type:

torch.Tensor

unit_shifts#

Additional shifts for each edge [n_edges, 3]

Type:

torch.Tensor

cell#

Unit cell vectors [3, 3]

Type:

torch.Tensor

pbc#

Periodic boundary conditions [3]

Type:

torch.Tensor

forces#

Atomic forces [n_nodes, 3]

Type:

torch.Tensor

energies#

Total energies [1]

Type:

torch.Tensor

stresses#

Stress tensor [1, 3, 3]

Type:

torch.Tensor

virials#

Virial tensor [1, 3, 3]

Type:

torch.Tensor

dipoles#

Dipole moment [1, 3]

Type:

torch.Tensor

node_charges#

Partial atomic charges [n_nodes]

Type:

torch.Tensor

graph_charges#

Total system charge [1]

Type:

torch.Tensor

info#

Additional information about the system

Type:

dict

add_edge_property(key, value)[source]#

Add an edge property to the graph.

Parameters:
  • key (str)

  • value (Any)

Return type:

None

add_node_property(key, value, node_dim=0)[source]#

Add a node property to the graph.

Parameters:
  • key (str)

  • value (Tensor)

  • node_dim (int)

Return type:

None

add_system_property(key, value)[source]#

Add a system property to the graph.

Parameters:
  • key (str)

  • value (Any)

Return type:

None

check_edge_consistency()[source]#

Validate that all edge-level properties have consistent edge counts.

This validator runs after all field validators and checks that any edge-level property that is set has the same number of edges as edge_index.

Returns:

Returns self if validation passes.

Return type:

Self

Raises:

ValueError – If any edge-level property has an inconsistent number of edges.
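
The check described above can be sketched without tensors. This is a minimal illustration, not the library's validator: the `check_edge_counts` function and its dictionary input are hypothetical, and plain lists stand in for tensors. Note that `edge_index` has shape [2, n_edges], so its edge count is read from the second dimension, while per-edge properties such as `shifts` carry it in the first.

```python
def check_edge_counts(edge_index, edge_properties):
    """Raise ValueError if any set edge property disagrees with edge_index.

    edge_index: two rows of node indices, i.e. shape [2, n_edges].
    edge_properties: hypothetical mapping of name -> list of per-edge rows,
        or None when the property is not set.
    """
    n_edges = len(edge_index[0])
    for name, value in edge_properties.items():
        if value is None:
            continue  # unset properties are skipped
        if len(value) != n_edges:
            raise ValueError(f"{name} has {len(value)} edges, expected {n_edges}")
    return True


edge_index = [[0, 1], [1, 0]]  # 2 edges

# Consistent: shifts has one row per edge, unit_shifts is unset.
ok = check_edge_counts(
    edge_index,
    {"shifts": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "unit_shifts": None},
)

# Inconsistent: only one shift row for two edges.
try:
    check_edge_counts(edge_index, {"shifts": [[0.0, 0.0, 0.0]]})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

The same pattern would apply to node-level properties, with the reference count taken from `atomic_numbers`.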

check_fp_dtype_consistency()[source]#

Ensures all floating-point tensors use the same precision as the positions tensor.

Return type:

AtomicData

check_node_consistency()[source]#

Validate that all node-level properties have consistent atom counts.

This validator runs after all field validators and checks that any node-level property that is set has the same number of nodes as atomic_numbers.

Returns:

Returns self if validation passes.

Return type:

Self

Raises:

ValueError – If any node-level property has an inconsistent number of nodes.

property chemical_hash: str#

Generate a unique hash for the chemical system using the BLAKE2s hashing algorithm.

The hash is unique to a given atomic composition and structure, and is invariant to the ordering of atoms in the data. It also distinguishes periodic from non-periodic systems; for periodic systems, it additionally depends on the lattice vectors and the directions of periodicity.

Returns:

A BLAKE2s hash string representing the chemical system.

Return type:

str

Notes

The hash is generated by:

  1. Sorting atoms by atomic number to ensure invariance to atom ordering.

  2. Including the atomic numbers and positions of the sorted atoms.

  3. Including periodic boundary conditions and cell parameters, if present.

  4. Computing a BLAKE2s hash of the formatted string representation.
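
The four steps in the Notes can be sketched with the standard library's `hashlib.blake2s`. This is not the library's implementation: the `sketch_chemical_hash` name, the string format, the rounding precision, and the use of positions as a tie-breaker when sorting are all assumptions made for illustration, so its digests will not match `chemical_hash`.

```python
import hashlib


def sketch_chemical_hash(atomic_numbers, positions, pbc=None, cell=None, ndigits=6):
    """Illustrative order-invariant hash of a chemical system."""
    # Step 1: sort atoms by atomic number. Positions break ties between atoms
    # of the same element (an assumption of this sketch).
    atoms = sorted(
        (z, tuple(round(x, ndigits) for x in pos))
        for z, pos in zip(atomic_numbers, positions)
    )
    # Step 2: include atomic numbers and positions of the sorted atoms.
    parts = [
        f"{z}:" + ",".join(f"{x:.{ndigits}f}" for x in pos) for z, pos in atoms
    ]
    # Step 3: include periodic boundary conditions and cell, if present.
    if pbc is not None:
        parts.append("pbc:" + "".join("1" if p else "0" for p in pbc))
    if cell is not None:
        parts.append(
            "cell:" + ",".join(f"{v:.{ndigits}f}" for row in cell for v in row)
        )
    # Step 4: BLAKE2s hash of the formatted string representation.
    return hashlib.blake2s("|".join(parts).encode("utf-8")).hexdigest()


z = [8, 1, 1]
pos = [[0.0, 0.0, 0.0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0]]

h_original = sketch_chemical_hash(z, pos)
# Same molecule with the first two atoms swapped.
h_permuted = sketch_chemical_hash([1, 8, 1], [pos[1], pos[0], pos[2]])
# Same atoms placed in a periodic cubic cell.
h_periodic = sketch_chemical_hash(
    z,
    pos,
    pbc=[True, True, True],
    cell=[[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]],
)
```

In this sketch the permuted system hashes to the same value as the original, while the periodic variant hashes differently, mirroring the invariance and periodicity behaviour described above.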

property device: device#

Get the device of the positions tensor.

property dtype: dtype#

Get the dtype of the positions tensor.

property edge_properties: dict[str, Any]#

Get the edge properties of the graph.

enforce_device_consistency()[source]#

Ensures all tensors are on the same device.

When atomic numbers and positions are on different devices, the validator tries to move them to the offload (accelerator) device in preference to the host CPU.

Return type:

AtomicData

classmethod from_atoms(atoms, energy_key='energy', forces_key='forces', stress_key='stress', virials_key='virials', dipole_key='dipole', charges_key='charges', device='cpu', dtype=torch.float32, z_table=None)[source]#

Creates AtomicData from a data structure.

Parameters:
  • atoms (Any) – The data structure to convert to AtomicData.

  • energy_key (str) – The key to get the energy from the data structure.

  • forces_key (str) – The key to get the forces from the data structure.

  • stress_key (str) – The key to get the stress from the data structure.

  • virials_key (str) – The key to get the virials from the data structure.

  • dipole_key (str) – The key to get the dipole from the data structure.

  • charges_key (str) – The key to get the charges from the data structure.

  • device (str | torch.device) – The device to convert the data to.

  • dtype (torch.dtype) – The dtype to convert the data to.

  • z_table (AtomicNumberTable | None) – The atomic number table to use for the atomic numbers.

Return type:

AtomicData

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'validate_assignment': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

property node_properties: dict[str, Any]#

Get the node properties of the graph.

property num_edges: int#

Return the number of edges in the graph.

property num_nodes: int#

Return the number of nodes in the graph.

property system_properties: dict[str, Any]#

Get the system properties of the graph.

use_default_categories()[source]#

Ensure that atom categories are set.

If a list is passed (which Pydantic should already have validated), it is converted to a tensor.

Return type:

AtomicData

use_default_masses()[source]#

If no atomic masses are set, automatically fill in with default masses from periodictable.

Returns:

Returns self if validation passes.

Return type:

Self
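
A sketch of this fallback, using a small hard-coded table in place of the periodictable lookup named above. The `DEFAULT_MASSES` table and the `fill_default_masses` function are hypothetical stand-ins, and plain lists replace tensors.

```python
# Standard atomic weights (in u) for a few elements; a stand-in for a full table.
DEFAULT_MASSES = {1: 1.008, 6: 12.011, 7: 14.007, 8: 15.999}


def fill_default_masses(atomic_numbers, atomic_masses=None):
    """Return the given masses, or look up defaults by atomic number."""
    if atomic_masses is not None:
        return list(atomic_masses)  # user-supplied masses are kept as-is
    return [DEFAULT_MASSES[z] for z in atomic_numbers]


# No masses given: defaults are filled in per atom.
water = fill_default_masses([8, 1, 1])

# Masses given (here, heavy water): they are left untouched.
heavy_water = fill_default_masses([8, 1, 1], [15.995, 2.014, 2.014])
```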

use_default_velocities()[source]#

If no velocities are set, initialize as zeros with proper shape and dtype.

Returns:

Returns self if validation passes.

Return type:

Self