
earth2studio.data: Data Sources

Data sources download, cache, and read data from different weather and climate data APIs into Xarray data arrays. They are used to fetch initial conditions for inference and validation data for scoring.

Warning

Each data source provided in Earth2Studio may have its own license. We encourage users to familiarize themselves with each license and the limitations it may impose on their use case.

data.ARCO([cache, verbose])

Analysis-Ready, Cloud Optimized (ARCO) is a data store of ERA5 re-analysis data curated by Google.

data.CDS([cache, verbose])

The Climate Data Store (CDS) data source serving ERA5 re-analysis data.

data.GFS([cache, verbose])

The Global Forecast System (GFS) initial state data source provided on an equirectangular grid.

data.HRRR([cache, verbose])

The High-Resolution Rapid Refresh (HRRR) data source provides hourly North American weather analysis data developed by NOAA (used to initialize the HRRR forecast model).

data.IFS([cache, verbose])

The Integrated Forecasting System (IFS) initial state data source provided on an equirectangular grid.

data.IMERG([auth, cache, verbose])

The Integrated Multi-satellitE Retrievals for GPM (IMERG) data source.

data.Random(domain_coords)

A data source that returns randomly generated, normally distributed data.

data.WB2ERA5([cache, verbose])

ERA5 reanalysis data with several derived variables on a 0.25 degree lat-lon grid from 1959 to 2023 (inclusive) at 6-hour intervals on 13 pressure levels.

data.WB2ERA5_121x240([cache, verbose])

ERA5 reanalysis data with several derived variables downsampled to a 1.5 degree lat-lon grid from 1959 to 2023 (inclusive) at 6-hour intervals on 13 pressure levels.

data.WB2ERA5_32x64([cache, verbose])

ERA5 reanalysis data with several derived variables downsampled to a 5.625 degree lat-lon grid from 1959 to 2023 (inclusive) at 6-hour intervals on 13 pressure levels.

data.WB2Climatology([...])

Climatology provided by WeatherBench2.

data.DataArrayFile(file_path, **xr_args)

A local Xarray DataArray file data source.

data.DataSetFile(file_path, array_name, ...)

A local Xarray Dataset file data source.
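
The grid sizes in the WB2ERA5 class names follow from the stated resolutions. The sketch below is a plain arithmetic check, not part of the Earth2Studio API: `grid_shape` is a hypothetical helper, and the `include_poles` switch reflects an assumption that the 1.5 degree grid includes both pole rows while the 5.625 degree grid does not.

```python
def grid_shape(resolution_deg: float, include_poles: bool = True) -> tuple[int, int]:
    """Return (n_lat, n_lon) for a regular global lat-lon grid.

    Hypothetical helper for illustration only. Longitude wraps around,
    so it has 360 / resolution points. Latitude has one extra row when
    both poles are grid points.
    """
    n_lon = round(360 / resolution_deg)
    n_lat = round(180 / resolution_deg) + (1 if include_poles else 0)
    return n_lat, n_lon


# 0.25 degree grid with pole rows
print(grid_shape(0.25))  # (721, 1440)

# 1.5 degree grid with pole rows, matching the name WB2ERA5_121x240
print(grid_shape(1.5))  # (121, 240)

# 5.625 degree grid without pole rows (assumption), matching WB2ERA5_32x64
print(grid_shape(5.625, include_poles=False))  # (32, 64)
```

The shape suffix is therefore a quick way to tell which resolution a given class serves.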

Forecast Sources

Extended data sources that allow users to download forecast data. These are not interchangeable with standard data sources and are typically used in intercomparison workflows.

data.GFS_FX([cache, verbose])

The Global Forecast System (GFS) forecast source provided on an equirectangular grid.

data.GEFS_FX([product, cache, verbose])

The Global Ensemble Forecast System (GEFS) forecast source is a 30-member ensemble forecast provided on a 0.5 degree equirectangular grid.

data.GEFS_FX_721x1440([product, cache, verbose])

The Global Ensemble Forecast System (GEFS) forecast source is a 30-member ensemble forecast provided on a 0.25 degree equirectangular grid.

data.HRRR_FX([cache, verbose])

The High-Resolution Rapid Refresh (HRRR) forecast source provides North American weather forecasts with hourly forecast runs, developed by NOAA.
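
One plausible reason forecast sources cannot stand in for standard data sources is that a forecast is addressed by an initialization time plus a lead time, not by a single timestamp. That addressing scheme is an assumption here, not something stated above, and the sketch below uses only the Python standard library. `valid_times` is a hypothetical helper showing how the two values combine into the time a forecast field is valid for.

```python
from datetime import datetime, timedelta


def valid_times(init_time: datetime, lead_times: list[timedelta]) -> list[datetime]:
    """Return the valid time of each forecast step (hypothetical helper).

    A forecast initialized at `init_time` with lead time `lt` describes
    the atmosphere at `init_time + lt`.
    """
    return [init_time + lt for lt in lead_times]


# Illustrative values: a run initialized at 00 UTC with 0, 6 and 12 hour leads
init = datetime(2024, 1, 1, 0)
leads = [timedelta(hours=h) for h in (0, 6, 12)]

for t in valid_times(init, leads):
    print(t.isoformat())
# 2024-01-01T00:00:00
# 2024-01-01T06:00:00
# 2024-01-01T12:00:00
```

In an intercomparison workflow, valid times computed this way would be the timestamps to request from an analysis source when scoring the forecast.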

Functions

data.datasource_to_file(file_name, source, ...)

Utility function that can be used for building a local data store needed for an inference request.

data.fetch_data(source, time, variable[, ...])

Utility function to fetch data for models and load it onto the target device.

data.prep_data_array(da[, device])

Prepares a data array from a data source for inference workflows by converting the data array to a torch tensor and the coordinate system to an OrderedDict.
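
To illustrate why the coordinate system is kept as an OrderedDict, the sketch below builds one by hand and reads the tensor layout off it. It does not call Earth2Studio, Xarray, or torch. The dimension names, coordinate values, and the helpers `expected_shape` and `dim_index` are all illustrative assumptions.

```python
from collections import OrderedDict

# Illustrative coordinate system: key order is assumed to match the
# dimension order of the accompanying tensor.
coords = OrderedDict(
    [
        ("time", ["2024-01-01T00:00"]),
        ("variable", ["t2m", "u10m"]),
        ("lat", [90.0, 0.0, -90.0]),
        ("lon", [0.0, 90.0, 180.0, 270.0]),
    ]
)


def expected_shape(coords: OrderedDict) -> tuple[int, ...]:
    """Shape a tensor must have to be consistent with `coords`."""
    return tuple(len(values) for values in coords.values())


def dim_index(coords: OrderedDict, name: str) -> int:
    """Position of dimension `name` along the tensor axes."""
    return list(coords).index(name)


print(expected_shape(coords))    # (1, 2, 3, 4)
print(dim_index(coords, "lat"))  # 2
```

Because the mapping is ordered, the axis for any named dimension can be recovered without carrying a separate list of dimension names alongside the tensor.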