nvalchemi.data.AtomicDataZarrReader
- class nvalchemi.data.AtomicDataZarrReader(store, *, pin_memory=False, include_index_in_metadata=True)
Reader for loading AtomicData from Zarr stores.
This reader provides random-access loading of AtomicData samples from Zarr stores created by AtomicDataZarrWriter. It supports soft-deleted samples via samples_mask and provides efficient random access using pointer arrays.
The expected Zarr store layout is:
dataset.zarr/
├── meta/                  # Pointer arrays + masks
│   ├── atoms_ptr          # int64 [N+1] - cumulative node counts
│   ├── edges_ptr          # int64 [N+1] - cumulative edge counts
│   └── samples_mask       # bool [N] - False = deleted sample
│
├── core/                  # AtomicData fields
│   ├── atomic_numbers     # int64 [V_total]
│   ├── positions          # float32 [V_total, 3]
│   └── …
│
└── custom/                # User-defined arrays (optional)
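The pointer-array scheme in the layout above can be illustrated with a small sketch. It is not the library's implementation: it uses plain Python lists in place of Zarr arrays, the sample data is made up, and the helper name `slice_sample` is hypothetical. It only shows how a cumulative `atoms_ptr` of length N+1 locates one sample's rows in a flat per-atom array, and how `samples_mask` hides soft-deleted samples.

```python
from itertools import accumulate

# Made-up per-sample atom counts for N = 3 samples (illustrative only).
atom_counts = [2, 3, 1]

# atoms_ptr has N+1 entries: cumulative node counts with a leading 0.
atoms_ptr = [0, *accumulate(atom_counts)]

# A flat per-atom array of length V_total, like the arrays under core/.
atomic_numbers = [1, 1, 8, 1, 1, 6]

# samples_mask has N entries; False marks a soft-deleted sample.
samples_mask = [True, False, True]


def slice_sample(flat, ptr, i):
    """Hypothetical helper: return the entries of `flat` that belong to sample i."""
    return flat[ptr[i]:ptr[i + 1]]


# Only samples whose mask entry is True remain visible.
valid_indices = [i for i, keep in enumerate(samples_mask) if keep]

sample_0 = slice_sample(atomic_numbers, atoms_ptr, 0)
sample_2 = slice_sample(atomic_numbers, atoms_ptr, 2)
```

Because each lookup needs only two adjacent pointer entries, reading one sample does not require scanning the samples before it.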
- Parameters:
store (StoreLike) – Any zarr-compatible store: a filesystem path (str or Path), a zarr Store instance (LocalStore, MemoryStore, FsspecStore, etc.), a StorePath, or a dict for in-memory buffer storage.
pin_memory (bool, default=False) – If True, place tensors in pinned (page-locked) memory for faster async CPU→GPU transfers.
include_index_in_metadata (bool, default=True) – If True, include sample index in the metadata dict.
- _store
The underlying zarr store reference.
- Type:
StoreLike
Examples
>>> from nvalchemi.data.datapipes.backends.zarr import AtomicDataZarrReader
>>> reader = AtomicDataZarrReader(store="dataset.zarr")
>>> data_dict, metadata = reader[0]  # returns dict and metadata
>>> atomic_data = AtomicDataZarrReader.to_atomic_data(data_dict)
- refresh()
Reload cached pointer arrays, masks, and metadata from the store.
Call this method after external modifications to the Zarr store (e.g., appending or deleting samples via AtomicDataZarrWriter) to ensure the reader reflects the current state of the data.
- Raises:
RuntimeError – If the reader has been closed.
- Return type:
None
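To show why a refresh step is needed when pointer arrays are cached, here is a minimal sketch built on a hypothetical stand-in class, `CachedPtrReader`. It is not the nvalchemi reader: the dict-based store, the `close()` method name, and the cached attribute are all assumptions made for illustration. The stand-in copies the pointer array at construction, so an external append stays invisible until `refresh()` is called, and `refresh()` raises RuntimeError once the reader is closed.

```python
class CachedPtrReader:
    """Hypothetical stand-in for a reader that caches a pointer array."""

    def __init__(self, store):
        self._store = store
        self._closed = False
        self.refresh()

    def refresh(self):
        # Mirrors the documented behaviour: refreshing a closed reader is an error.
        if self._closed:
            raise RuntimeError("reader has been closed")
        # Copy the array so later store changes are invisible until the next refresh.
        self._atoms_ptr = list(self._store["atoms_ptr"])

    def __len__(self):
        # N samples are described by N+1 pointer entries.
        return len(self._atoms_ptr) - 1

    def close(self):
        # Assumed method name; the real reader's closing API may differ.
        self._closed = True


store = {"atoms_ptr": [0, 2, 5]}
reader = CachedPtrReader(store)
len_before = len(reader)

# Simulate an external writer appending one sample with a single atom.
store["atoms_ptr"].append(6)
len_stale = len(reader)   # the cached copy has not changed yet

reader.refresh()
len_after = len(reader)   # the cache now matches the store

reader.close()
try:
    reader.refresh()
    raised = False
except RuntimeError:
    raised = True
```

The same pattern applies to any cached view of a mutable store: reads are cheap between refreshes, at the cost of an explicit call after the store changes.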