zarr_writer

Zarr writer sink for xarray DataArrays.

Writes incoming xarray.DataArray objects to a Zarr store, creating one Zarr group per variable, each with dimensions (time, lat, lon). Supports user-specified chunking and Zarr v3 sharding.

Classes

ZarrSink

Write xarray.DataArray fields to a Zarr store.

Module Contents

class physicsnemo_curator.domains.da.sinks.zarr_writer.ZarrSink(
output_path: str,
chunks: dict[str, int] | None = None,
shards: dict[str, int] | None = None,
)

Bases: physicsnemo_curator.core.base.Sink[xarray.DataArray]

Write xarray.DataArray fields to a Zarr store.

Each incoming DataArray is expected to carry coordinate metadata (e.g. time, variable, lat, lon). The sink uses these coordinates — not the pipeline index — to organise the output.

DataArrays with a variable dimension are split along it so that each variable gets its own Zarr group: <output_path>/<variable_name>/, with dimensions (time, lat, lon). Subsequent calls append along the time dimension, so the sink accumulates data across pipeline indices based on the time coordinate in the incoming data.
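The layout described above can be modelled in a few lines of plain Python. This is only a sketch of the bookkeeping, not the sink's actual code: `plan_groups`, the batch dictionaries, and the variable names `t2m` and `u10` are hypothetical, and no Zarr or xarray calls are made.

```python
from pathlib import PurePosixPath


def plan_groups(output_path, batches):
    """Map each variable's group path to the time steps appended to it.

    Each batch stands in for one incoming DataArray: it lists the entries
    of its variable dimension and the values of its time coordinate.
    """
    root = PurePosixPath(output_path)
    plan = {}
    for batch in batches:
        for name in batch["variables"]:
            # One group per variable: <output_path>/<variable_name>
            group = str(root / name)
            # Later batches extend the same group along time.
            plan.setdefault(group, []).extend(batch["times"])
    return plan


# Two hypothetical batches, each carrying the same two variables.
batches = [
    {"variables": ["t2m", "u10"], "times": ["2020-01-01T00"]},
    {"variables": ["t2m", "u10"], "times": ["2020-01-01T06"]},
]
plan = plan_groups("output.zarr", batches)
```

Here `plan` holds one entry per variable, and each entry accumulates the time steps from both batches, which mirrors how the store grows along the time dimension across calls.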

Parameters:
  • output_path (str) – Path to the output Zarr store directory.

  • chunks (dict[str, int] | None) – Chunk sizes per dimension for the Zarr arrays. Defaults to {"time": 1, "lat": 721, "lon": 1440} (one time-step per chunk, full spatial extent).

  • shards (dict[str, int] | None) – Shard sizes per dimension (Zarr v3 only). When provided, each shard is a container for multiple chunks. Requires zarr>=3.0. If None, sharding is not used.
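To get a feel for how chunks and shards relate, the arithmetic can be sketched as below. The helper names are hypothetical, and the sketch assumes a 4-byte element type (such as float32), shard sizes given in array elements, and that each shard size must be a whole multiple of the matching chunk size. How ZarrSink itself validates these values is not documented here.

```python
from math import prod

# Default chunking quoted in the parameter description above.
DEFAULT_CHUNKS = {"time": 1, "lat": 721, "lon": 1440}


def chunk_nbytes(chunks, itemsize=4):
    """Uncompressed size of one chunk in bytes (itemsize=4 assumes float32)."""
    return prod(chunks.values()) * itemsize


def chunks_per_shard(chunks, shards):
    """Number of chunks stored in one shard.

    Assumes both dicts use the same dimension names and that every shard
    size is an exact multiple of the corresponding chunk size.
    """
    counts = []
    for dim, chunk in chunks.items():
        shard = shards[dim]
        if shard % chunk != 0:
            raise ValueError(
                f"shard size {shard} for {dim!r} is not a multiple of chunk size {chunk}"
            )
        counts.append(shard // chunk)
    return prod(counts)


# Hypothetical shard layout: 24 time steps per shard, full spatial extent.
example_shards = {"time": 24, "lat": 721, "lon": 1440}

size = chunk_nbytes(DEFAULT_CHUNKS)                        # 4152960 bytes
count = chunks_per_shard(DEFAULT_CHUNKS, example_shards)   # 24
```

With the default chunking, one chunk holds a single time step of roughly 4 MB uncompressed, so a shard spanning 24 time steps groups 24 such chunks into one stored object.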

Examples

>>> sink = ZarrSink(
...     output_path="output.zarr",
...     chunks={"time": 1, "lat": 721, "lon": 1440},
... )

classmethod params() -> list[physicsnemo_curator.core.base.Param]

Return parameter descriptors for the Zarr sink.

Returns:

Descriptors for output_path and chunks.

Return type:

list[Param]

description: ClassVar[str] = 'Write DataArrays to a Zarr store with configurable chunking and sharding'

Short description shown in the interactive CLI.

name: ClassVar[str] = 'Zarr Writer'

Human-readable display name for the interactive CLI.

property output_path: pathlib.Path

Return the output Zarr store path.

property zarr_version: int

Return the Zarr format version in use.