earth2studio.data.ISD

class earth2studio.data.ISD(stations, tolerance=numpy.timedelta64(0), cache=True, verbose=True, async_timeout=600)

NOAA’s Integrated Surface Database (ISD) is a global database that consists of hourly and synoptic surface observations compiled from numerous sources into a common data model.

Parameters:
  • stations (list[str]) – Station IDs to attempt to fetch data from. Each ID is the concatenation of the station's USAF identifier (6 chars) and WBAN identifier (5 digits).

  • tolerance (timedelta | np.timedelta64, optional) – Time tolerance; for each requested time, the nearest row within +/- tolerance is used, by default np.timedelta64(0)

  • cache (bool, optional) – Cache data source on local memory, by default True

  • verbose (bool, optional) – Print download progress and missing data warnings, by default True

  • async_timeout (int, optional) – Time in seconds after which the download is cancelled if it has not finished successfully, by default 600
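The tolerance behavior described above can be illustrated with a minimal, standard-library sketch. The helper `nearest_within` is hypothetical and is not part of earth2studio; it only shows the assumed semantics that, for a requested time, the closest observation is kept if it lies within +/- tolerance and nothing is returned otherwise. Under this reading, the default tolerance of zero accepts only an exact timestamp match.

```python
from datetime import datetime, timedelta
from typing import Optional


def nearest_within(
    request: datetime,
    observed: list[datetime],
    tolerance: timedelta = timedelta(0),
) -> Optional[datetime]:
    """Return the observation time closest to `request`, or None if the
    closest one is farther away than `tolerance`.

    Hypothetical helper for illustration only; not the ISD implementation.
    """
    if not observed:
        return None
    # Pick the observation with the smallest absolute time difference.
    best = min(observed, key=lambda t: abs(t - request))
    return best if abs(best - request) <= tolerance else None


# Made-up observation times for a single station.
obs = [datetime(2024, 1, 1, 18, 53), datetime(2024, 1, 1, 19, 53)]
request = datetime(2024, 1, 1, 20)

exact_only = nearest_within(request, obs)  # default tolerance of zero
within_two_hours = nearest_within(request, obs, timedelta(hours=2))
```

Here `exact_only` is `None` because no observation falls exactly on the requested hour, while `within_two_hours` selects the 19:53 observation. Surface reports are often not aligned to the top of the hour, which is one reason a non-zero tolerance may be useful.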

Warning

This is a remote data source and can potentially download a large amount of data to your local machine for large requests.

Note

To help build a list of possible station IDs, this class includes ISD.get_stations_bbox(), which accepts a lat-lon bounding box and returns known historical station IDs. For more information on the stations, users should consult isd-history.csv, which can easily be accessed with ISD.get_station_history().
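When station IDs are assembled by hand instead, the USAF + WBAN format from the stations parameter can be sketched as below. `make_station_id` is a hypothetical helper, not part of earth2studio, and the zero-padding of WBAN is an assumption about how identifiers with dropped leading zeros should be handled. The values shown are illustrative; consult isd-history.csv for real identifiers.

```python
def make_station_id(usaf: str, wban: str) -> str:
    """Concatenate a USAF identifier (6 characters) and a WBAN identifier
    (5 digits) into an 11-character station ID.

    Hypothetical helper for illustration only; not part of earth2studio.
    """
    wban = wban.zfill(5)  # assumption: restore leading zeros if dropped
    if len(usaf) != 6:
        raise ValueError(f"USAF must be 6 characters, got {usaf!r}")
    if len(wban) != 5 or not wban.isdigit():
        raise ValueError(f"WBAN must be 5 digits, got {wban!r}")
    return usaf + wban


# Illustrative values; check isd-history.csv for real station identifiers.
station_id = make_station_id("724940", "23234")
padded_id = make_station_id("A00001", "94")
```

Keeping both parts as strings avoids losing leading zeros, which would happen if the identifiers were parsed as integers.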

Example

from datetime import datetime, timedelta
from earth2studio.data import ISD

# Bay Area lat-lon bounding box (lat min, lon min, lat max, lon max)
stations = ISD.get_stations_bbox((36, -124, 40, -120))
ds = ISD(stations, tolerance=timedelta(hours=2))
df = ds(datetime(2024, 1, 1, 20), ["station", "time", "lat", "lon", "t2m"])

__call__(time, variable)

Get data for the requested times and variables.

Parameters:
  • time (datetime | list[datetime] | TimeArray) – Timestamps to return data for (UTC).

  • variable (str | list[str] | VariableArray) – String, list of strings or array of strings that refer to variables to return. Must be in the ISD lexicon.

Returns:

ISD data frame

Return type:

pd.DataFrame

async fetch(time, variable)

Asynchronously get data for the requested times and variables.

Parameters:
  • time (datetime | list[datetime] | TimeArray) – Timestamps to return data for (UTC).

  • variable (str | list[str] | VariableArray) – String, list of strings or array of strings that refer to variables (column ids) to return. Must be in the ISD lexicon.

Returns:

ISD data frame

Return type:

pd.DataFrame
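Because fetch is a coroutine, it has to be awaited from an event loop. The sketch below shows the general asyncio pattern using a stand-in coroutine, `fake_fetch`, which is hypothetical and exists only so the example runs without network access or earth2studio installed; with a real data source, a call such as `ds.fetch(time, variable)` would take its place. The use of `asyncio.wait_for` illustrates a caller-side timeout and is an assumption, not a description of how async_timeout is implemented internally.

```python
import asyncio
from datetime import datetime


async def fake_fetch(time: datetime, variable: list[str]) -> dict:
    """Stand-in for an async data source call (hypothetical)."""
    await asyncio.sleep(0.01)  # simulate waiting on remote I/O
    return {"time": time, "variable": variable}


async def main() -> dict:
    # Cancel the request if it does not finish within 5 seconds.
    return await asyncio.wait_for(
        fake_fetch(datetime(2024, 1, 1, 20), ["t2m"]), timeout=5
    )


# asyncio.run starts an event loop, runs main() to completion, and closes it.
result = asyncio.run(main())
```

In environments that already run an event loop, such as a Jupyter notebook, the coroutine would typically be awaited directly instead of being passed to `asyncio.run`.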