era5

ERA5 reanalysis source with multi-backend support.

Fetches ERA5 data from one or more earth2studio backends (ARCO, WB2, NCAR, CDS). Each requested variable is routed to the highest-priority backend whose lexicon contains it. When variables span multiple backends, results are fetched separately and merged along the variable dimension.
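The routing rule described above can be sketched in a few lines. This is a minimal illustration, not the module's actual implementation: the `LEXICONS` table, its contents, and the helper names `route_variables` and `group_by_backend` are assumptions made up for the example, and the real earth2studio lexicons differ.

```python
# Hypothetical lexicons: which variables each backend can serve.
# These sets are illustrative and do not reflect the real earth2studio lexicons.
LEXICONS = {
    "arco": {"t2m", "u10m", "z500"},
    "ncar": {"t2m", "cp"},
}


def route_variables(variables, backends, lexicons=LEXICONS):
    """Map each variable to the first backend in priority order that lists it."""
    routing = {}
    for var in variables:
        for backend in backends:
            if var in lexicons[backend]:
                routing[var] = backend
                break
        else:
            # No backend in the priority list offers this variable.
            raise ValueError(f"No backend provides variable {var!r}")
    return routing


def group_by_backend(routing):
    """Invert the routing so each backend is queried once for its variables."""
    groups = {}
    for var, backend in routing.items():
        groups.setdefault(backend, []).append(var)
    return groups


routing = route_variables(["t2m", "cp"], ["arco", "ncar"])
print(routing)                    # {'t2m': 'arco', 'cp': 'ncar'}
print(group_by_backend(routing))  # {'arco': ['t2m'], 'ncar': ['cp']}
```

Grouping by backend before fetching means each backend is called once per timestamp, after which the per-backend results can be joined along the variable dimension.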

Each pipeline index corresponds to a single timestamp, and the returned xarray.DataArray has dimensions (time, variable, lat, lon) with a single time step.

Classes

ERA5Source

Fetch ERA5 reanalysis fields from earth2studio backends.

Module Contents

class physicsnemo_curator.domains.da.sources.era5.ERA5Source(
times: list[datetime.datetime],
variables: list[str],
*,
backend: str | list[str] = 'arco',
backend_options: dict[str, dict[str, Any]] | None = None,
cache: bool = True,
)

Bases: physicsnemo_curator.core.base.Source[xarray.DataArray]

Fetch ERA5 reanalysis fields from earth2studio backends.

Supports four backends — ARCO, WB2, NCAR, and CDS — with automatic per-variable routing. Each variable is assigned to the highest-priority backend whose lexicon contains it.

Parameters:
  • times (list[datetime]) – Timestamps to fetch. Must be within the range of the selected backend(s).

  • variables (list[str]) – Earth2studio variable identifiers (e.g. "t2m", "z500").

  • backend (str | list[str]) – Backend name, or a list of backend names in priority order (highest priority first). Valid names: "arco", "wb2", "ncar", "cds". The default, "arco", preserves backward compatibility.

  • backend_options (dict[str, dict[str, Any]] | None) – Per-backend keyword arguments, keyed by backend name and forwarded to that backend's constructor. Example: {"ncar": {"max_workers": 8}}. The cache and verbose parameters are set automatically.

  • cache (bool) – Whether to cache downloaded chunks locally (default True).
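Because backend accepts either a single name or a priority-ordered list, an implementation has to normalize it to one form before routing. The sketch below shows one way this could be done. The helper `normalize_backends`, the `VALID_BACKENDS` constant, and the choice to raise ValueError on an unknown name are assumptions for illustration, not the documented behavior of ERA5Source.

```python
# Valid backend names as listed in the parameter description above.
VALID_BACKENDS = ("arco", "wb2", "ncar", "cds")


def normalize_backends(backend="arco"):
    """Return a priority-ordered list of backend names (hypothetical helper)."""
    # A bare string is treated as a one-element priority list.
    names = [backend] if isinstance(backend, str) else list(backend)
    if not names:
        raise ValueError("At least one backend is required")
    for name in names:
        if name not in VALID_BACKENDS:
            raise ValueError(
                f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}"
            )
    return names


print(normalize_backends())                  # ['arco']
print(normalize_backends(["arco", "ncar"]))  # ['arco', 'ncar']
```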

Examples

>>> from datetime import datetime
>>> source = ERA5Source(
...     times=[datetime(2020, 6, 1, 0)],
...     variables=["t2m", "u10m"],
... )
>>> len(source)
1

Multi-backend with fallback:

>>> source = ERA5Source(
...     times=[datetime(2020, 6, 1, 0)],
...     variables=["t2m", "cp"],
...     backend=["arco", "ncar"],
... )
>>> source.variable_routing
{'t2m': 'arco', 'cp': 'ncar'}

Note

classmethod params() -> list[physicsnemo_curator.core.base.Param]

Return parameter descriptors for the ERA5 source.

Returns:

Descriptors for times, variables, backend, and cache.

Return type:

list[Param]

property active_backend: str | None

Return the single backend name if all variables use one backend.

Returns None if variables are split across multiple backends.

Returns:

Backend name or None.

Return type:

str | None

property backends_used: set[str]

Return the set of backend names that have at least one variable routed to them.
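Both active_backend and backends_used can be understood as views derived from the variable-to-backend mapping. The following is a minimal sketch under that assumption. The free functions here are hypothetical stand-ins for the properties, and how the real class handles an empty mapping is not stated in this reference.

```python
def backends_used(routing):
    """Return the distinct backend names that appear in the routing."""
    return set(routing.values())


def active_backend(routing):
    """Return the backend name if exactly one is in use, otherwise None."""
    used = backends_used(routing)
    # Assumption: an empty routing also yields None.
    return next(iter(used)) if len(used) == 1 else None


single = {"t2m": "arco", "u10m": "arco"}
split = {"t2m": "arco", "cp": "ncar"}

print(active_backend(single))  # arco
print(active_backend(split))   # None
print(sorted(backends_used(split)))  # ['arco', 'ncar']
```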

description: ClassVar[str] = 'ERA5 reanalysis via earth2studio (ARCO, WB2, NCAR, CDS)'

Short description shown in the interactive CLI.

name: ClassVar[str] = 'ERA5'

Human-readable display name for the interactive CLI.

property times: list[datetime.datetime]

Return the list of timestamps in this source.

property variable_routing: dict[str, str]

Return mapping of variable name to backend name.

property variables: list[str]

Return the list of variable IDs in this source.