nvalchemi.data.AtomicData#
- class nvalchemi.data.AtomicData(*, atomic_numbers, positions, atomic_masses=None, atom_categories=None, edge_index=None, shifts=None, unit_shifts=None, cell=None, pbc=None, forces=None, energies=None, stresses=None, virials=None, dipoles=None, node_charges=None, graph_charges=None, node_attrs=None, node_alpha_spins=None, node_beta_spins=None, graph_spins=None, graph_alpha_spins=None, node_embeddings=None, edge_embeddings=None, graph_embeddings=None, velocities=None, momenta=None, kinetic_energies=None, info=<factory>, **extra_data)[source]#
Atomic data structure for molecular systems.
Represents molecular systems as graphs with atomic properties and interactions. Uses Pydantic for validation and serialization, with DataMixin for graph functionality.
- Parameters:
atomic_numbers (Annotated[Int64[Tensor, 'V'], FieldInfo(annotation=NoneType, required=True, description='Atomic numbers for each node [n_nodes]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
positions (Annotated[Float[Tensor, 'V 3'], FieldInfo(annotation=NoneType, required=True, description='Cartesian coordinates for each atom [n_nodes, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
atomic_masses (Annotated[Float[Tensor, 'V'] | None, FieldInfo(annotation=NoneType, required=True, description='Atomic masses [n_nodes]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
atom_categories (Annotated[list[AtomCategory] | Integer[Tensor, 'V'] | None, FieldInfo(annotation=NoneType, required=True, description='Atom categorical index, based on _typing.AtomCategory Enum [n_nodes]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
edge_index (Annotated[Int64[Tensor, '2 E'] | None, FieldInfo(annotation=NoneType, required=True, description='Edge index [2, n_edges]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
shifts (Annotated[Float[Tensor, 'E 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Shifts for each edge [n_edges, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
unit_shifts (Annotated[Float[Tensor, 'E 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Additional shifts for each edge [n_edges, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
cell (Annotated[Float[Tensor, 'B 3 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Unit cell vectors [3, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
pbc (Annotated[Bool[Tensor, 'B 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Boolean tensor indicating periodic boundary conditions along each dimension'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
forces (Annotated[Float[Tensor, 'V 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Atomic forces [n_nodes, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
energies (Annotated[Float[Tensor, 'B 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Total energies [1]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
stresses (Annotated[Float[Tensor, 'B 3 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Stresses tensor [1, 3, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
virials (Annotated[Float[Tensor, 'B 3 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Virial tensor [1, 3, 3]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
dipoles (Annotated[Float[Tensor, 'B 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Dipole moments of the system.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
node_charges (Annotated[Float[Tensor, 'V 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Partial atomic charges [n_nodes]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
graph_charges (Annotated[Float[Tensor, 'B 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Total system charges [1]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
node_attrs (Annotated[Float[Tensor, 'V A'] | None, FieldInfo(annotation=NoneType, required=True, description='Node attributes [n_nodes, n_node_attrs]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
node_alpha_spins (Annotated[Float[Tensor, 'V 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Alpha spins for each atom, [n_nodes, 1]. Use this field for closed-shell spins.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
node_beta_spins (Annotated[Float[Tensor, 'V 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Beta spins for each atom, [n_nodes, 1]. For restricted spin, use ``node_alpha_spins`` instead.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
graph_spins (Annotated[Float[Tensor, 'B 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Spin or multiplicity value for the system, [1, 1]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
graph_alpha_spins (Annotated[Float[Tensor, 'B 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Alpha spins for the entire graph, [1, 1]'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
node_embeddings (Annotated[Float[Tensor, 'V H'] | None, FieldInfo(annotation=NoneType, required=True, description='Embeddings for each node within the batch/graph.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
edge_embeddings (Annotated[Float[Tensor, 'E H'] | None, FieldInfo(annotation=NoneType, required=True, description='Embeddings for each edge within the batch/graph.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
graph_embeddings (Annotated[Float[Tensor, 'B H'] | None, FieldInfo(annotation=NoneType, required=True, description='Embeddings for the entire graph/graphs within a batch.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
velocities (Annotated[Float[Tensor, 'V 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Atomic velocities [n_nodes, 3], in units set by positions.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
momenta (Annotated[Float[Tensor, 'V 3'] | None, FieldInfo(annotation=NoneType, required=True, description='Atomic momenta [n_nodes, 3], in units set by positions.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
kinetic_energies (Annotated[Float[Tensor, 'V 1'] | None, FieldInfo(annotation=NoneType, required=True, description='Per-atom kinetic energies [n_nodes, 1], with the same units as energies.'), PlainSerializer(func=~nvalchemi.data.atomic_data._tensor_serialization, return_type=PydanticUndefined, when_used=json)])
info (dict[str, Tensor])
extra_data (Any)
- atomic_numbers#
Atomic numbers of each atom [n_nodes]
- Type:
torch.Tensor
- positions#
Cartesian coordinates [n_nodes, 3]
- Type:
torch.Tensor
- atomic_masses#
Atomic masses [n_nodes]
- Type:
torch.Tensor
- edge_index#
Edge index [2, n_edges]
- Type:
torch.Tensor
- node_attrs#
Node attributes [n_nodes, n_node_attrs]
- Type:
torch.Tensor
- shifts#
Shifts for each edge [n_edges, 3]
- Type:
torch.Tensor
- unit_shifts#
Additional shifts for each edge [n_edges, 3]
- Type:
torch.Tensor
- cell#
Unit cell vectors [3, 3]
- Type:
torch.Tensor
- pbc#
Periodic boundary conditions [3]
- Type:
torch.Tensor
- forces#
Atomic forces [n_nodes, 3]
- Type:
torch.Tensor
- energies#
Total energies [1]
- Type:
torch.Tensor
- stresses#
Stress tensor [1, 3, 3]
- Type:
torch.Tensor
- virials#
Virial tensor [1, 3, 3]
- Type:
torch.Tensor
- dipoles#
Dipole moment [1, 3]
- Type:
torch.Tensor
- node_charges#
Partial atomic charges [n_nodes]
- Type:
torch.Tensor
- graph_charges#
Total system charge [1]
- Type:
torch.Tensor
- info#
Additional information about the system
- Type:
dict
- add_edge_property(key, value)[source]#
Add an edge property to the graph.
- Parameters:
key (str)
value (Any)
- Return type:
None
- add_node_property(key, value, node_dim=0)[source]#
Add a node property to the graph.
- Parameters:
key (str)
value (Tensor)
node_dim (int)
- Return type:
None
- add_system_property(key, value)[source]#
Add a system property to the graph.
- Parameters:
key (str)
value (Any)
- Return type:
None
- check_edge_consistency()[source]#
Validate that all edge-level properties have consistent edge counts.
This validator runs after all field validators and checks that any edge-level property that is set has the same number of edges as edge_index.
- Returns:
Returns self if validation passes.
- Return type:
Self
- Raises:
ValueError – If any edge-level property has an inconsistent number of edges.
- check_fp_dtype_consistency()[source]#
Ensures all floating-point tensors have the same precision as the positions tensor.
- Return type:
- check_node_consistency()[source]#
Validate that all node-level properties have consistent atom counts.
This validator runs after all field validators and checks that any node-level property that is set has the same number of nodes as atomic_numbers.
- Returns:
Returns self if validation passes.
- Return type:
Self
- Raises:
ValueError – If any node-level property has an inconsistent number of nodes.
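The kind of check described above can be sketched in plain Python. This is a minimal illustration, not the library's validator: the function name `check_node_consistency_sketch` is hypothetical, and plain lists stand in for tensors. It assumes that properties left as `None` are skipped and that the leading dimension of each property is the node dimension.

```python
def check_node_consistency_sketch(atomic_numbers, **node_properties):
    """Hypothetical stand-in: verify every set node-level property has
    one entry per atom, using atomic_numbers as the reference length."""
    n_nodes = len(atomic_numbers)
    for name, value in node_properties.items():
        if value is None:
            # Unset optional properties are not checked.
            continue
        if len(value) != n_nodes:
            raise ValueError(
                f"{name} has {len(value)} entries, expected {n_nodes}"
            )
    return n_nodes


# Three atoms, three force vectors, masses left unset: passes.
n = check_node_consistency_sketch(
    [8, 1, 1],
    forces=[[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [-0.1, 0.0, 0.0]],
    atomic_masses=None,
)

# Two force vectors for three atoms: rejected.
try:
    check_node_consistency_sketch([8, 1, 1], forces=[[0.0, 0.0, 0.0]] * 2)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

The same pattern would apply to edge-level properties, with the edge count taken from the second dimension of edge_index instead of the length of atomic_numbers.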
- property chemical_hash: str#
Generate a unique hash for the chemical system using the blake2s hashing algorithm.
The hash is unique to a given atomic composition and structure, and is invariant to the ordering of atoms in the data. The hash also differentiates between periodic and non-periodic systems and, for periodic systems, between different lattice vectors and directions of periodicity.
- Returns:
A blake2s hash string representing the chemical system.
- Return type:
str
Notes
The hash is generated by:
1. Sorting atoms by atomic number to ensure invariance to atom ordering
2. Including atomic numbers and positions of sorted atoms
3. Including periodic boundary conditions and cell parameters if present
4. Computing a BLAKE2s hash of the formatted string representation
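The steps in the Notes can be sketched with the standard library alone. This is an illustration under stated assumptions, not the implementation behind `chemical_hash`: the function name, the string format, the rounding to `decimals` places, and the use of position as a tie-breaker among atoms of the same element are all choices made here for the example, and plain lists stand in for tensors.

```python
import hashlib


def chemical_hash_sketch(atomic_numbers, positions, cell=None, pbc=None, decimals=6):
    """Hypothetical order-invariant hash of a structure (illustration only)."""
    # Steps 1-2: pair each atomic number with its rounded coordinates and sort.
    # Adding 0.0 turns a rounded -0.0 into 0.0 so both format identically.
    atoms = sorted(
        (z, tuple(round(c, decimals) + 0.0 for c in xyz))
        for z, xyz in zip(atomic_numbers, positions)
    )
    parts = [
        f"{z}:" + ",".join(f"{c:.{decimals}f}" for c in xyz) for z, xyz in atoms
    ]
    # Step 3: include periodicity flags and cell vectors only when both are given.
    if cell is not None and pbc is not None:
        parts.append("pbc:" + ",".join(str(bool(p)) for p in pbc))
        parts.append(
            "cell:" + ",".join(f"{v:.{decimals}f}" for row in cell for v in row)
        )
    # Step 4: BLAKE2s digest of the formatted string.
    return hashlib.blake2s("|".join(parts).encode("utf-8")).hexdigest()


numbers = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
h_original = chemical_hash_sketch(numbers, coords)

# Same molecule with the first two atoms swapped.
h_permuted = chemical_hash_sketch(
    [1, 8, 1], [[0.96, 0.0, 0.0], [0.0, 0.0, 0.0], [-0.24, 0.93, 0.0]]
)

# Same atoms placed in a periodic box.
box = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
h_periodic = chemical_hash_sketch(numbers, coords, cell=box, pbc=[True, True, True])
```

In this sketch the permuted molecule hashes to the same value as the original, while the periodic copy does not, which mirrors the invariance and periodic/non-periodic distinction described above.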
- property device: device#
Get the device of the positions tensor.
- property dtype: dtype#
Get the dtype of the positions tensor.
- property edge_properties: dict[str, Any]#
Get the edge properties of the graph.
- enforce_device_consistency()[source]#
Enforces all tensors to be on the same device.
When atomic numbers and positions are on different devices, the method attempts to move them to a common device, preferring the offload device over the host CPU.
- Return type:
- classmethod from_atoms(atoms, energy_key='energy', forces_key='forces', stress_key='stress', virials_key='virials', dipole_key='dipole', charges_key='charges', device='cpu', dtype=torch.float32, z_table=None)[source]#
Creates AtomicData from a data structure.
- Parameters:
atoms (Any) – The data structure to convert to AtomicData.
energy_key (str) – The key to get the energy from the data structure.
forces_key (str) – The key to get the forces from the data structure.
stress_key (str) – The key to get the stress from the data structure.
virials_key (str) – The key to get the virials from the data structure.
dipole_key (str) – The key to get the dipole from the data structure.
charges_key (str) – The key to get the charges from the data structure.
device (str | torch.device) – The device to convert the data to.
dtype (torch.dtype) – The dtype to convert the data to.
z_table (AtomicNumberTable | None) – The atomic number table to use for the atomic numbers.
- Return type:
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'validate_assignment': True}#
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- property node_properties: dict[str, Any]#
Get the node properties of the graph.
- property num_edges: int#
Return the number of edges in the graph.
- property num_nodes: int#
Return the number of nodes in the graph.
- property system_properties: dict[str, Any]#
Get the system properties of the graph.
- use_default_categories()[source]#
Check to make sure categories for atoms are set.
If a list is passed, which should be validated by pydantic, it is converted to a tensor.
- Return type: