earth2studio.data.CMIP6MultiRealm

class earth2studio.data.CMIP6MultiRealm(cmip6_source_list)

CMIP6 data source for Earth2Studio with multiple realms.

This class allows combining multiple CMIP6 data sources from different realms (e.g., atmosphere, ocean, sea ice) into a single unified interface. Variables are fetched from each source in the order provided, and data on different grids are automatically regridded to a common regular lat/lon grid.

Parameters:

cmip6_source_list (list[CMIP6]) – List of CMIP6 data sources to combine. Variables will be fetched from sources in the order they appear in the list. All sources must have the same exact_time_match setting.

Raises:
  • ValueError – If cmip6_source_list is empty or if sources have different exact_time_match settings.

  • TypeError – If any item in cmip6_source_list is not a CMIP6 instance.

Note

When multiple sources have different grids, curvilinear grids (e.g., from ocean or sea ice models) will be interpolated to the first regular lat/lon grid found using nearest-neighbor interpolation.

All CMIP6 sources must be initialized with the same exact_time_match setting to ensure consistent time matching behavior across realms.
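The constructor checks listed under Raises can be pictured with a small stand-alone sketch. This is not the library's implementation: `FakeCMIP6` and `validate_sources` are hypothetical names introduced for illustration, and the order in which the real class performs its checks is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FakeCMIP6:
    """Hypothetical stand-in for a CMIP6 source (not the earth2studio class)."""

    name: str
    exact_time_match: bool = False


def validate_sources(sources: list) -> None:
    """Apply the three documented checks to a list of sources."""
    # An empty list is rejected with ValueError.
    if not sources:
        raise ValueError("cmip6_source_list must not be empty")
    # Every item must be a source instance, otherwise TypeError.
    for src in sources:
        if not isinstance(src, FakeCMIP6):
            raise TypeError(f"expected FakeCMIP6, got {type(src).__name__}")
    # All sources must share one exact_time_match setting.
    if len({src.exact_time_match for src in sources}) > 1:
        raise ValueError("all sources must share the same exact_time_match")


# Two sources with the same setting pass validation silently.
validate_sources([FakeCMIP6("atmosphere"), FakeCMIP6("ocean")])
```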

__call__(time, variable)

Retrieve data from multiple CMIP6 sources and combine it into a single array.

This method fetches the requested variables from the available CMIP6 sources, automatically regridding data from different grids to a common grid, and combines all variables into a single DataArray.

Parameters:
  • time (datetime | list[datetime] | TimeArray) – Timestamps to return data for.

  • variable (str | list[str] | VariableArray) – Variable(s) to retrieve. Each variable will be fetched from the first source in the list that has it available.

Returns:

Combined data array with dimensions (time, variable, lat, lon) or (time, variable, j, i) depending on the grid type.

Return type:

xr.DataArray

Raises:
  • ValueError – If a requested variable is not found in any of the sources.

  • NotImplementedError – If all sources use curvilinear grids (at least one regular lat/lon grid required).

Note

Variables are retrieved from sources in the order they appear in cmip6_source_list. If multiple sources contain the same variable, only the first one will be used.
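The first-source-wins rule can be illustrated with a short sketch, assuming each source is reduced to a name plus the set of variables it offers. The helper `resolve_variables`, the source names, and the variable names are hypothetical and chosen only for illustration; the library's internal lookup may differ.

```python
def resolve_variables(requested, sources):
    """Map each requested variable to the first source that offers it.

    `sources` is an ordered list of (source_name, available_variables) pairs.
    """
    assignment = {}
    missing = []
    for var in requested:
        for name, available in sources:
            if var in available:
                # Stop at the first match; later sources are ignored.
                assignment[var] = name
                break
        else:
            # No source offered this variable.
            missing.append(var)
    if missing:
        raise ValueError(f"variables not found in any source: {missing}")
    return assignment


# Both sources offer "sst", but the atmosphere source is listed first.
sources = [("atmosphere", {"t2m", "sst"}), ("ocean", {"sst", "sic"})]
print(resolve_variables(["t2m", "sst", "sic"], sources))
```

Reordering the list changes which source supplies a shared variable, so the list order acts as a priority order.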

Curvilinear grids (ocean/sea ice) are regridded to regular grids using nearest-neighbor interpolation to preserve data coverage near coastlines. At least one source with a regular lat/lon grid (typically atmospheric data) is required when combining multiple sources with different grids.
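As a rough picture of nearest-neighbor regridding, the sketch below maps values from a curvilinear (j, i) grid onto a regular lat/lon grid by brute force. It is a simplified illustration, not the library's method: `regrid_nearest` is a hypothetical helper, distances are measured in plain degrees rather than along great circles, and longitude wrap-around is ignored.

```python
def regrid_nearest(src_lat, src_lon, src_val, dst_lat, dst_lon):
    """Return values on a regular grid taken from the nearest source cell.

    src_lat, src_lon, src_val are 2D lists indexed [j][i];
    dst_lat and dst_lon are 1D lists of the target coordinates.
    """
    # Flatten the curvilinear grid into (lat, lon, value) points.
    points = [
        (src_lat[j][i], src_lon[j][i], src_val[j][i])
        for j in range(len(src_val))
        for i in range(len(src_val[0]))
    ]
    out = []
    for lat in dst_lat:
        row = []
        for lon in dst_lon:
            # Pick the source point with the smallest squared distance.
            nearest = min(
                points, key=lambda p: (p[0] - lat) ** 2 + (p[1] - lon) ** 2
            )
            row.append(nearest[2])
        out.append(row)
    return out


# A slightly skewed 2x2 source grid mapped onto a regular 2x2 target grid.
src_lat = [[0.0, 0.1], [1.0, 1.1]]
src_lon = [[0.0, 1.0], [0.2, 1.2]]
src_val = [[10, 20], [30, 40]]
print(regrid_nearest(src_lat, src_lon, src_val, [0.0, 1.0], [0.0, 1.0]))
```

Because every target cell copies an existing source value rather than averaging neighbors, cells next to masked or missing regions keep a valid value, which is consistent with the coastline rationale given above.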

classmethod available(time, cmip6_source_list)

Check if the requested timestamp is available in all sources.

Parameters:
  • time (datetime | np.datetime64) – Timestamp to test (UTC).

  • cmip6_source_list (list[CMIP6]) – List of CMIP6 data sources to check.

Returns:

True if the timestamp is available in all sources, False otherwise.

Return type:

bool

Warning

This method may download data from ESGF servers for each source to check time availability. For multiple sources, this can result in significant data transfer.

Note

This method checks that ALL sources have data available at the requested time, since combining multi-realm data requires data from all sources. Each source is checked by downloading at least one file to verify the time coordinate range.
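The all-sources requirement amounts to a logical AND over per-source checks. The sketch below assumes each source's time coverage is already known as a (start, end) pair, whereas the real method reads the time coordinate from downloaded files. `available_in_all` and the date ranges are hypothetical.

```python
from datetime import datetime


def available_in_all(time, source_ranges):
    """Return True only if `time` lies inside every source's time range."""
    # all() stops at the first source that does not cover the timestamp.
    return all(start <= time <= end for start, end in source_ranges)


# Hypothetical coverage for an atmosphere source and an ocean source.
ranges = [
    (datetime(2015, 1, 1), datetime(2100, 12, 31)),
    (datetime(2020, 1, 1), datetime(2100, 12, 31)),
]
print(available_in_all(datetime(2030, 6, 1), ranges))
print(available_in_all(datetime(2016, 6, 1), ranges))
```

The second call falls outside the ocean source's range, so the combined check fails even though the atmosphere source covers that date.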