NomadsGDASObsConv

class earth2studio.data.NomadsGDASObsConv(
time_tolerance=numpy.timedelta64(10, 'm'),
max_workers=4,
decode_workers=8,
cache=True,
verbose=True,
async_timeout=600,
retries=3,
)
Global

Real-time GDAS conventional observation data from NOAA NOMADS PrepBUFR.

Provides near-real-time access to quality-controlled conventional (in-situ) observations from the NOAA Global Data Assimilation System (GDAS). Data is sourced from PrepBUFR files on NOMADS, which are updated four times daily (00z, 06z, 12z, 18z) with a latency of approximately 6-10 hours.
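The cycle schedule and latency described above can be turned into a rough estimate of the newest cycle worth requesting. The following is a minimal sketch using only the standard library; the helper name `latest_cycle` is hypothetical, and the default 10-hour latency simply takes the upper end of the quoted 6-10 hour range as an assumption.

```python
from datetime import datetime, timedelta, timezone

CYCLE_HOURS = 6  # GDAS cycles run at 00z, 06z, 12z, 18z


def latest_cycle(now: datetime, latency: timedelta = timedelta(hours=10)) -> datetime:
    """Return the most recent cycle assumed to be published (hypothetical helper)."""
    # Assumption: a cycle is usable once `latency` has elapsed since its nominal time.
    cutoff = now.astimezone(timezone.utc) - latency
    # Floor the cutoff hour to the nearest earlier multiple of 6.
    floored_hour = cutoff.hour - cutoff.hour % CYCLE_HOURS
    return cutoff.replace(hour=floored_hour, minute=0, second=0, microsecond=0)


now = datetime(2025, 1, 15, 13, 30, tzinfo=timezone.utc)
cycle = latest_cycle(now)
```

Pass a timezone-aware `now`; a naive datetime would be interpreted as local time by `astimezone`. The actual publication time varies per cycle, so the `available` classmethod remains the place to check a specific time.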

Observation types include radiosondes (ADPUPA), surface stations (ADPSFC), aircraft (AIRCAR/AIRCFT), ships and buoys (SFCSHP), wind profilers (PROFLR), satellite-derived winds (SATWND), and GPS precipitable water (GPSIPW).

The output schema matches UFSObsConv with the addition of a quality field containing the PrepBUFR quality control marker.

Parameters:
  • time_tolerance (TimeTolerance, optional) – Time tolerance window for filtering observations. Accepts a single value (symmetric +/- window) or a tuple (lower, upper) for asymmetric windows, by default np.timedelta64(10, "m").

  • max_workers (int, optional) – Maximum concurrent async download tasks, by default 4.

  • decode_workers (int, optional) – Number of parallel processes for BUFR message decoding. Higher values speed up decoding of large PrepBUFR files at the cost of more memory. Set to 1 to disable multiprocessing, by default 8.

  • cache (bool, optional) – Cache downloaded PrepBUFR files locally, by default True.

  • verbose (bool, optional) – Print download progress, by default True.

  • async_timeout (int, optional) – Total timeout in seconds for the entire fetch, by default 600.

  • retries (int, optional) – Number of retry attempts per failed download with exponential backoff, by default 3.
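To illustrate the symmetric and asymmetric forms of `time_tolerance`, here is a small standard-library sketch of the filtering idea. It is not the library's implementation: the helper names are hypothetical, `datetime.timedelta` stands in for `np.timedelta64`, and two assumptions are made that this page does not confirm, namely that the tuple holds signed offsets relative to the requested time and that both bounds are inclusive.

```python
from datetime import datetime, timedelta


def normalize_tolerance(tol):
    """Turn a single tolerance or a (lower, upper) pair into signed offsets."""
    if isinstance(tol, tuple):
        lower, upper = tol
        return lower, upper
    # A single value is treated as a symmetric +/- window.
    return -tol, tol


def in_window(obs_time, target, tol):
    """True if obs_time falls inside the tolerance window around target."""
    lower, upper = normalize_tolerance(tol)
    # Assumption: both bounds are inclusive.
    return target + lower <= obs_time <= target + upper


target = datetime(2025, 1, 15, 12, 0)
obs = [
    target - timedelta(minutes=25),
    target - timedelta(minutes=5),
    target + timedelta(minutes=8),
    target + timedelta(minutes=20),
]

# Symmetric: keep observations within 10 minutes either side of the target.
symmetric = [t for t in obs if in_window(t, target, timedelta(minutes=10))]
# Asymmetric: keep observations from the 30 minutes before the target only.
asymmetric = [t for t in obs if in_window(t, target, (timedelta(minutes=-30), timedelta(0)))]
```

A wider window returns more observations per requested time, which matters when the result feeds a step that expects roughly one report per station.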

Warning

This is a remote data source and can potentially download a large amount of data to your local machine for large requests. Each 6-hourly PrepBUFR file is approximately 60-70 MB.
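The per-file size quoted above allows a back-of-the-envelope download estimate before issuing a large request. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that each requested timestamp maps to the 6-hourly cycle at or before it and that each cycle is downloaded once. The real mapping from timestamps to files may differ.

```python
from datetime import datetime

FILE_MB_RANGE = (60, 70)  # approximate per-file size quoted in the warning


def cycles_touched(times):
    """Distinct 6-hourly cycles covering the timestamps (assumed floor mapping)."""
    return {
        t.replace(hour=t.hour - t.hour % 6, minute=0, second=0, microsecond=0)
        for t in times
    }


def estimate_download_mb(times):
    """Return a (low, high) estimate in MB, one file per distinct cycle."""
    n_files = len(cycles_touched(times))
    return n_files * FILE_MB_RANGE[0], n_files * FILE_MB_RANGE[1]


times = [datetime(2025, 1, 15, h) for h in (0, 3, 6, 12, 13)]
low, high = estimate_download_mb(times)
```

With `cache=True`, files already downloaded should not need to be fetched again, so the estimate applies to a cold cache.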

Note

Additional information on the data:

Data is retained on the NOMADS production server for approximately 2 days. Older data should be retrieved from the UFS GEFSv13 Replay dataset via UFSObsConv.

__call__(time, variable, fields=None)

Fetch conventional observation data.

Parameters:
  • time (datetime | list[datetime] | TimeArray) – Timestamps to return data for (UTC).

  • variable (str | list[str] | VariableArray) – Variables to return. Must be in GDASObsConvLexicon.

  • fields (str | list[str] | pa.Schema | None, optional) – Schema fields to include in output. None returns all fields.

Returns:

Observation data matching the requested time/variable window.

Return type:

pd.DataFrame

Raises:
  • KeyError – If a variable is not found in the lexicon.

  • ValueError – If a requested time is outside the valid range.
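Because the call can raise `KeyError` or `ValueError`, callers may want to wrap it. The following is a minimal sketch of such a wrapper; `fetch_or_none` is a hypothetical helper, and `stub_source` is a stand-in with the same `(time, variable, fields=None)` call shape so the example runs without network access or the library installed.

```python
def fetch_or_none(source, time, variable, fields=None):
    """Call a data source; map the two documented errors to (None, message)."""
    try:
        return source(time, variable, fields=fields), None
    except KeyError as err:
        return None, f"unknown variable: {err}"
    except ValueError as err:
        return None, f"time out of range: {err}"


def stub_source(time, variable, fields=None):
    """Hypothetical stand-in for a data source instance."""
    if variable != "known":
        raise KeyError(variable)
    return {"time": time, "variable": variable}


data, message = fetch_or_none(stub_source, "2025-01-15T12:00", "known")
missing, error = fetch_or_none(stub_source, "2025-01-15T12:00", "bogus")
```

In real use, `source` would be a `NomadsGDASObsConv()` instance, `time` a UTC datetime, and `variable` one or more names taken from GDASObsConvLexicon; the strings above are placeholders, not valid lexicon entries.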

async fetch(time, variable, fields=None)

Async fetch of conventional observation data.

Parameters:
  • time (datetime | list[datetime] | TimeArray) – Timestamps to return data for (UTC).

  • variable (str | list[str] | VariableArray) – Variables to return.

  • fields (str | list[str] | pa.Schema | None, optional) – Schema fields to include in output.

Returns:

Observation data.

Return type:

pd.DataFrame
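Since `fetch` is a coroutine, it has to be awaited from an event loop. The sketch below shows one way to drive several awaits with `asyncio.gather`. `StubSource` is a hypothetical stand-in exposing the same `fetch(time, variable, fields=None)` shape, and `fetch_many` is an illustrative helper. Whether the real class tolerates concurrent calls on a single instance is not stated on this page, so treat that as an assumption.

```python
import asyncio


class StubSource:
    """Hypothetical stand-in with the same async fetch signature."""

    async def fetch(self, time, variable, fields=None):
        await asyncio.sleep(0)  # yield control, as a network call would
        return {"time": time, "variable": variable, "fields": fields}


async def fetch_many(source, requests):
    """Await several fetches together; results keep the request order."""
    tasks = [source.fetch(time, variable) for time, variable in requests]
    return await asyncio.gather(*tasks)


results = asyncio.run(fetch_many(StubSource(), [("t0", "a"), ("t1", "b")]))
```

Outside of an existing event loop, the synchronous `__call__` is the simpler entry point; `asyncio.run` cannot be called from code that is already running inside a loop.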

classmethod available(time)

Check whether a date-time is available on NOMADS.

Parameters:

time (datetime | np.datetime64) – Date time to check.

Returns:

True if the time falls within the NOMADS retention window.

Return type:

bool
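One use for this check is to split a list of requested times into those still on NOMADS and those that need the UFSObsConv fallback mentioned in the note above. A minimal sketch follows; `split_by_availability` is a hypothetical helper, and `stub_available` imitates a 2-day retention window against a fixed clock so the example is deterministic. In real use the predicate would be `NomadsGDASObsConv.available`.

```python
from datetime import datetime, timedelta


def split_by_availability(times, is_available):
    """Partition times into (available, unavailable) using a predicate."""
    recent, older = [], []
    for t in times:
        (recent if is_available(t) else older).append(t)
    return recent, older


NOW = datetime(2025, 1, 15, 12)  # fixed clock for the stub only


def stub_available(t):
    """Hypothetical predicate: assumes an exact 2-day retention window."""
    return NOW - timedelta(days=2) <= t <= NOW


times = [
    datetime(2025, 1, 10, 0),
    datetime(2025, 1, 14, 18),
    datetime(2025, 1, 15, 6),
]
recent, older = split_by_availability(times, stub_available)
```

The `recent` list can then go to the NOMADS source and the `older` list to the archive source, keeping a single code path for both.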