nvalchemi.dynamics.ZarrData

class nvalchemi.dynamics.ZarrData(store, capacity=1_000_000)

Zarr-backed storage for batched atomic data.

This sink persists atomic data in the Zarr format and works with local filesystem, remote, and in-memory stores through the StoreLike type. It delegates serialization to AtomicDataZarrWriter, which provides efficient, amortized I/O using CSR-style pointer arrays.
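To illustrate the CSR-style pointer arrays mentioned above, here is a minimal NumPy sketch of the general technique: variable-size samples are concatenated into one flat array, and a pointer array records where each sample starts. The names `positions`, `ptr`, and `get_sample` are hypothetical and chosen for illustration; the actual on-disk layout used by AtomicDataZarrWriter is not documented here and may differ.

```python
import numpy as np

# Three hypothetical samples with 2, 3, and 1 atoms (3D positions each).
samples = [
    np.zeros((2, 3)),
    np.ones((3, 3)),
    np.full((1, 3), 2.0),
]

# Concatenate all per-atom rows into a single flat array.
positions = np.concatenate(samples, axis=0)

# Pointer array: ptr[i] is the first row of sample i, ptr[-1] is the total row count.
counts = np.array([len(s) for s in samples])
ptr = np.concatenate([[0], np.cumsum(counts)])


def get_sample(i):
    """Return the rows belonging to sample i (illustrative helper)."""
    return positions[ptr[i]:ptr[i + 1]]
```

With this layout, appending a sample only extends the flat array and adds one pointer entry, which is one plausible reason such a format suits amortized writes.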

Any zarr-compatible store is supported: filesystem paths (str or Path), zarr Store instances (LocalStore, MemoryStore, or FsspecStore for remote storage such as S3 or GCS), StorePath, or a dict for in-memory buffers.

Parameters:
  • store (StoreLike) – Any zarr-compatible store: filesystem path (str or Path), zarr Store instance, StorePath, or dict for in-memory buffer storage.

  • capacity (int, optional) – Maximum number of samples to store. Default is 1,000,000.

capacity

Maximum storage capacity.

Type:

int

store

The backing zarr store.

Type:

StoreLike

Examples

>>> from nvalchemi.dynamics import ZarrData
>>> zarr_sink = ZarrData("/path/to/store", capacity=100000)
>>> zarr_sink.write(batch)  # batch: an existing Batch of atomic data
>>> loaded_batch = zarr_sink.read()

Using an in-memory store:

>>> zarr_sink = ZarrData({}, capacity=1000)  # dict acts as memory store

__init__(store, capacity=1_000_000)

Initialize the Zarr data sink.

Parameters:
  • store (StoreLike) – Any zarr-compatible store: filesystem path (str or Path), zarr Store instance, StorePath, or dict for in-memory buffer storage.

  • capacity (int, optional) – Maximum number of samples to store. Default is 1,000,000.

Return type:

None

Methods

__init__(store[, capacity])

Initialize the Zarr data sink.

drain()

Read all stored samples and clear the sink.

read()

Load all stored data from Zarr as a Batch.

write(batch[, mask])

Store a batch of atomic data to Zarr.

zero()

Clear all stored data and reset the store.

Attributes

capacity

Return the maximum storage capacity.

global_rank

Return the global rank of this data sink.

is_full

Check if the buffer has reached capacity.

local_rank

Return the local rank of this data sink.
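
The method and attribute summaries above can be sketched with a small in-memory stand-in. `ToySink` is hypothetical and is not the nvalchemi implementation. It assumes that `mask` is a per-sample boolean selection and that writing past `capacity` raises an error; neither behaviour is specified on this page.

```python
class ToySink:
    """Hypothetical in-memory stand-in for a data sink (illustration only)."""

    def __init__(self, capacity=1_000_000):
        self.capacity = capacity
        self._samples = []

    @property
    def is_full(self):
        # True once the number of stored samples reaches capacity.
        return len(self._samples) >= self.capacity

    def write(self, batch, mask=None):
        # Assumption: mask marks which samples of the batch to keep.
        if mask is not None:
            batch = [s for s, keep in zip(batch, mask) if keep]
        # Assumption: exceeding capacity is an error rather than a silent drop.
        if len(batch) > self.capacity - len(self._samples):
            raise RuntimeError("sink capacity exceeded")
        self._samples.extend(batch)

    def read(self):
        # Return everything stored, leaving the sink unchanged.
        return list(self._samples)

    def zero(self):
        # Clear all stored samples.
        self._samples.clear()

    def drain(self):
        # Read everything, then clear the sink.
        out = self.read()
        self.zero()
        return out


sink = ToySink(capacity=3)
sink.write(["a", "b", "c"], mask=[True, False, True])
after_write = sink.read()
full_before = sink.is_full
sink.write(["d"])
full_after = sink.is_full
drained = sink.drain()
remaining = sink.read()
```

The difference between `read()` and `drain()` in this sketch is only whether the stored samples survive the call, which matches the one-line descriptions in the Methods list.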