NomadsGDASObsConv
class earth2studio.data.NomadsGDASObsConv(
    time_tolerance=numpy.timedelta64(10, 'm'),
    max_workers=4,
    decode_workers=8,
    cache=True,
    verbose=True,
    async_timeout=600,
    retries=3,
)
Global
Real-time GDAS conventional observation data from NOAA NOMADS PrepBUFR.
Provides near-real-time access to quality-controlled conventional (in-situ) observations from the NOAA Global Data Assimilation System (GDAS). Data is sourced from PrepBUFR files on NOMADS, updated 4 times daily (00z, 06z, 12z, 18z) with approximately 6-10 hours latency.
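The cycle schedule and latency above determine which cycle is likely to be on the server at a given moment. The following is a minimal sketch of that arithmetic, not part of earth2studio: the helper name `latest_available_cycle` is hypothetical, and the 10-hour default is an assumption that takes the conservative end of the stated 6-10 hour range.

```python
from datetime import datetime, timedelta

# Assumption: use the slow end of the documented 6-10 hour latency range.
ASSUMED_LATENCY = timedelta(hours=10)


def latest_available_cycle(now: datetime, latency: timedelta = ASSUMED_LATENCY) -> datetime:
    """Return the most recent 00/06/12/18z cycle that should already be published.

    `now` is a naive datetime interpreted as UTC.
    """
    ready = now - latency
    # Floor to the previous 6-hour boundary.
    return ready.replace(hour=(ready.hour // 6) * 6, minute=0, second=0, microsecond=0)


cycle = latest_available_cycle(datetime(2024, 1, 2, 15, 30))
```

With a 10-hour latency, a request made at 15:30 UTC resolves to the 00z cycle of the same day, because the 06z file may not have been published yet.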
Observation types include radiosondes (ADPUPA), surface stations (ADPSFC), aircraft (AIRCAR/AIRCFT), ships and buoys (SFCSHP), wind profilers (PROFLR), satellite-derived winds (SATWND), and GPS precipitable water (GPSIPW).
The output schema matches UFSObsConv, with the addition of a quality field containing the PrepBUFR quality control marker.
- Parameters:
time_tolerance (TimeTolerance, optional) – Time tolerance window for filtering observations. Accepts a single value (symmetric +/- window) or a tuple (lower, upper) for asymmetric windows, by default np.timedelta64(10, 'm').
max_workers (int, optional) – Maximum concurrent async download tasks, by default 4.
decode_workers (int, optional) – Number of parallel processes for BUFR message decoding. Higher values speed up decoding of large PrepBUFR files at the cost of more memory. Set to 1 to disable multiprocessing, by default 8.
cache (bool, optional) – Cache downloaded PrepBUFR files locally, by default True.
verbose (bool, optional) – Print download progress, by default True.
async_timeout (int, optional) – Total timeout in seconds for the entire fetch, by default 600.
retries (int, optional) – Number of retry attempts per failed download with exponential backoff, by default 3.
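To illustrate the two forms of time_tolerance, the sketch below computes the window of observation times that each form would select around a requested timestamp. It is not the library's implementation: the helper `tolerance_window` is hypothetical, it uses `datetime.timedelta` in place of `np.timedelta64` to stay dependency-free, and it assumes that the tuple form carries signed offsets (a negative lower bound reaches back in time).

```python
from datetime import datetime, timedelta


def tolerance_window(time, tolerance):
    """Return (start, end) of the observation window around `time`.

    `tolerance` is one timedelta (symmetric) or a (lower, upper) tuple.
    Assumption: tuple entries are signed offsets relative to `time`.
    """
    if isinstance(tolerance, tuple):
        lower, upper = tolerance
    else:
        lower, upper = -tolerance, tolerance
    return time + lower, time + upper


t0 = datetime(2024, 1, 1, 12)
symmetric = tolerance_window(t0, timedelta(minutes=10))
lookback_only = tolerance_window(t0, (timedelta(minutes=-30), timedelta(0)))
```

Here `symmetric` spans 11:50 to 12:10, while `lookback_only` spans 11:30 to 12:00 and excludes observations taken after the requested time.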
Warning
This is a remote data source and can potentially download a large amount of data to your local machine for large requests. Each 6-hourly PrepBUFR file is approximately 60-70 MB.
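As a rough guide to the download volume, the per-file size quoted above can be multiplied by the number of distinct 6-hourly cycles a request touches. The helper below is a hypothetical back-of-the-envelope sketch; how many cycles a given set of timestamps maps to is left to the caller.

```python
def estimate_download_mb(n_cycles, mb_per_file=(60, 70)):
    """Return a (low, high) estimate in MB for `n_cycles` PrepBUFR files.

    The default range is the approximate per-file size quoted in the warning.
    """
    low, high = mb_per_file
    return n_cycles * low, n_cycles * high


# Eight cycles is two full days of 6-hourly files.
two_days = estimate_download_mb(8)
```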
Note
Additional information on the data:
https://nomads.ncep.noaa.gov/pub/data/nccf/com/obsproc/prod/
https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/document.htm
https://www.emc.ncep.noaa.gov/emc/pages/numerical_forecast_systems/gfs.php
Data is retained on the NOMADS production server for approximately 2 days. Older data should be retrieved from the UFS GEFSv13 Replay dataset via UFSObsConv.
- __call__(time, variable, fields=None)
Fetch conventional observation data.
- Parameters:
time (datetime | list[datetime] | TimeArray) – Timestamps to return data for (UTC).
variable (str | list[str] | VariableArray) – Variables to return. Must be in GDASObsConvLexicon.
fields (str | list[str] | pa.Schema | None, optional) – Schema fields to include in output. None returns all fields.
- Returns:
Observation data matching the requested time/variable window.
- Return type:
pd.DataFrame
- Raises:
KeyError – If a variable is not found in the lexicon.
ValueError – If requested time is out of valid range.
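A minimal usage sketch for the synchronous call follows, using the constructor and `__call__` signatures documented here. The wrapper `fetch_recent_obs` and the constant `ASSUMED_RETENTION` are hypothetical: the wrapper fails early when the requested time looks older than the roughly 2-day retention noted above, and this guard is an illustration, not the library's own range check. The import is deferred so that the guard runs without earth2studio installed.

```python
from datetime import datetime, timedelta, timezone

# Assumption: approximate NOMADS retention; the server's real cutoff may differ.
ASSUMED_RETENTION = timedelta(days=2)


def fetch_recent_obs(time, variables, now=None):
    """Fetch observations for one naive-UTC `time`, failing early if it looks too old."""
    if now is None:
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    if now - time > ASSUMED_RETENTION:
        raise ValueError(
            f"{time} is likely outside the NOMADS retention window; consider UFSObsConv"
        )

    # Deferred import: only needed once the request passes the guard.
    from earth2studio.data import NomadsGDASObsConv

    source = NomadsGDASObsConv(cache=True, verbose=False)
    # fields=None returns every field in the schema.
    return source([time], variables, fields=None)
```

A call such as `fetch_recent_obs(some_recent_time, ["t"])` would return a pd.DataFrame; the variable name `"t"` is a placeholder and must be replaced with a name that exists in GDASObsConvLexicon.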
- async fetch(time, variable, fields=None)
Async fetch of conventional observation data.
- Parameters:
time (datetime | list[datetime] | TimeArray) – Timestamps to return data for (UTC).
variable (str | list[str] | VariableArray) – Variables to return.
fields (str | list[str] | pa.Schema | None, optional) – Schema fields to include in output.
- Returns:
Observation data.
- Return type:
pd.DataFrame
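The sketch below shows one way to await `fetch` from your own coroutine with an outer deadline. It relies only on the documented `fetch(time, variable, fields=None)` signature: `fetch_with_deadline` is a hypothetical helper, `_StubSource` is a stand-in that returns a dict instead of a DataFrame so the example runs offline, and the variable name `"t"` is a placeholder. The outer deadline is independent of the `async_timeout` constructor argument.

```python
import asyncio
from datetime import datetime


class _StubSource:
    """Offline stand-in exposing the documented async fetch signature."""

    async def fetch(self, time, variable, fields=None):
        await asyncio.sleep(0)  # yield to the event loop, as a real download would
        return {"time": time, "variable": variable, "fields": fields}


async def fetch_with_deadline(source, time, variable, fields=None, deadline_s=600.0):
    """Await source.fetch, raising a timeout error if it exceeds `deadline_s` seconds."""
    return await asyncio.wait_for(
        source.fetch(time, variable, fields=fields), timeout=deadline_s
    )


# Replace _StubSource() with a NomadsGDASObsConv instance for a real request.
result = asyncio.run(
    fetch_with_deadline(_StubSource(), [datetime(2024, 1, 1, 12)], ["t"])
)
```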