Statistical Inference
This example will demonstrate how to run a simple inference workflow to generate a forecast and then to save a statistic of that data. There are a handful of built-in statistics available in earth2studio.statistics, but here we will demonstrate how to define a custom statistic and run inference.
In this example you will learn:
How to instantiate a built-in prognostic model
How to create a data source and IO object
How to create a custom statistic
How to run a simple built-in workflow
How to post-process results
Creating a statistical workflow
Start by creating a simple inference workflow to use. We encourage users to explore and experiment with their own custom workflows that borrow ideas from the built-in workflows inside earth2studio.run or the examples.
Creating our own generalizable workflow to use with statistics is easy when we rely on the component interfaces defined in Earth2Studio (that is, dependency injection). Here we create a run method that accepts the following:
time: Input list of datetimes / strings to run inference for
nsteps: Number of forecast steps to predict
nensemble: Number of ensemble members (part of the function signature, but not used by this workflow)
prognostic: Our initialized prognostic model
statistic: Our custom statistic
data: Initialized data source to fetch initial conditions from
io: Initialized IO backend that the statistic is written to
We do not run an ensemble inference workflow here, even though it is common for statistical inference. See the ensemble examples for details on how to extend this example for that purpose.
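To illustrate the dependency-injection idea in isolation, here is a minimal sketch that does not use Earth2Studio at all. The names `Forecaster`, `Reducer`, `Doubler`, `Mean` and `run` are hypothetical stand-ins for a prognostic model, a statistic and a workflow; the only point is that the workflow depends on interfaces, so any conforming component can be swapped in.

```python
from typing import Iterator, Protocol


class Forecaster(Protocol):
    """Hypothetical stand-in for a prognostic model interface."""

    def iterate(self, x: list[float]) -> Iterator[list[float]]: ...


class Reducer(Protocol):
    """Hypothetical stand-in for a statistic interface."""

    def __call__(self, x: list[float]) -> float: ...


class Doubler:
    """Toy model: yields the initial state, then doubles it every step."""

    def iterate(self, x: list[float]) -> Iterator[list[float]]:
        while True:
            yield x
            x = [2.0 * v for v in x]


class Mean:
    """Toy statistic: arithmetic mean of the state."""

    def __call__(self, x: list[float]) -> float:
        return sum(x) / len(x)


def run(x0: list[float], nsteps: int, model: Forecaster, stat: Reducer) -> list[float]:
    """Apply `stat` to the initial state and to each of `nsteps` forecast steps."""
    results = []
    for step, x in enumerate(model.iterate(x0)):
        results.append(stat(x))
        if step == nsteps:
            break
    return results


# The workflow never needs to know which concrete classes it was given
values = run([1.0, 3.0], nsteps=2, model=Doubler(), stat=Mean())
```

Like the workflow in this example, the toy loop produces `nsteps + 1` results because the initial state is included.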
import os
os.makedirs("outputs", exist_ok=True)
from dotenv import load_dotenv
load_dotenv() # TODO: make common example prep function
from datetime import datetime
import numpy as np
import pandas as pd
import torch
from loguru import logger
from tqdm import tqdm
from earth2studio.data import DataSource, fetch_data
from earth2studio.io import IOBackend
from earth2studio.models.px import PrognosticModel
from earth2studio.statistics import Statistic
from earth2studio.utils.coords import map_coords
from earth2studio.utils.time import to_time_array
logger.remove()
logger.add(lambda msg: tqdm.write(msg, end=""), colorize=True)
def run_stats(
    time: list[str] | list[datetime] | list[np.datetime64],
    nsteps: int,
    nensemble: int,
    prognostic: PrognosticModel,
    statistic: Statistic,
    data: DataSource,
    io: IOBackend,
) -> IOBackend:
    """Simple statistics workflow

    Parameters
    ----------
    time : list[str] | list[datetime] | list[np.datetime64]
        List of string, datetimes or np.datetime64
    nsteps : int
        Number of forecast steps
    nensemble : int
        Number of ensemble members to run inference for.
    prognostic : PrognosticModel
        Prognostic models
    statistic : Statistic
        Custom statistic to compute and write to IO.
    data : DataSource
        Data source
    io : IOBackend
        IO object

    Returns
    -------
    IOBackend
        Output IO object
    """
    logger.info("Running simple statistics workflow!")
    # Load model onto the device
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    logger.info(f"Inference device: {device}")
    prognostic = prognostic.to(device)
    # Fetch data from data source and load onto device
    time = to_time_array(time)
    x, coords = fetch_data(
        source=data,
        time=time,
        lead_time=prognostic.input_coords()["lead_time"],
        variable=prognostic.input_coords()["variable"],
        device=device,
    )
    logger.success(f"Fetched data from {data.__class__.__name__}")

    # Set up IO backend
    total_coords = coords.copy()
    output_coords = prognostic.output_coords(prognostic.input_coords())
    total_coords["lead_time"] = np.asarray(
        [output_coords["lead_time"] * i for i in range(nsteps + 1)]
    ).flatten()
    # Remove reduced dimensions from statistic
    for d in statistic.reduction_dimensions:
        total_coords.pop(d, None)

    io.add_array(total_coords, str(statistic))

    # Map lat and lon if needed
    x, coords = map_coords(x, coords, prognostic.input_coords())

    # Create prognostic iterator
    model = prognostic.create_iterator(x, coords)

    logger.info("Inference starting!")
    with tqdm(total=nsteps + 1, desc="Running inference") as pbar:
        for step, (x, coords) in enumerate(model):
            s, coords = statistic(x, coords)
            io.write(s, coords, str(statistic))
            pbar.update(1)
            if step == nsteps:
                break

    logger.success("Inference complete")
    return io
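The IO set-up step in the workflow above can be examined on its own. The following sketch uses only NumPy and reproduces how the `lead_time` axis is built and how the statistic's reduced dimensions are dropped from the output coordinates. The 24 hour model step, the tiny grid and the list of reduced dimensions are assumptions chosen for illustration, not values read from a real model.

```python
from collections import OrderedDict

import numpy as np

# Assumed model step: one 24 hour lead time, stored as a length-1 timedelta64 array
model_step = np.array([np.timedelta64(24, "h")])
nsteps = 3

# Same construction as in the workflow: step * 0, step * 1, ..., step * nsteps
lead_time = np.asarray([model_step * i for i in range(nsteps + 1)]).flatten()

# Hypothetical coordinates of a single-variable forecast on a tiny grid
total_coords = OrderedDict(
    time=np.array([np.datetime64("2022-01-01")]),
    lead_time=lead_time,
    variable=np.array(["msl"]),
    lat=np.array([-17.5, -12.5]),
    lon=np.array([130.0, 210.0]),
)

# A statistic that reduces over variable, lat and lon leaves only time and lead_time
for dim in ["variable", "lat", "lon"]:
    total_coords.pop(dim, None)

remaining_dims = list(total_coords)
lead_hours = lead_time.astype("int64").tolist()
```

The array registered with the IO backend therefore has one entry per initial time and per lead time, including lead time zero for the initial state.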
Set Up
With the statistical workflow defined, we now need to create the individual components.
We need the following:
Prognostic Model: Use the built-in Pangu 24 hour model earth2studio.models.px.Pangu24.
Statistic: We define our own statistic: the Southern Oscillation Index (SOI).
Datasource: Pull data from the GFS data API earth2studio.data.GFS.
IO Backend: Save the outputs into a NetCDF4 store earth2studio.io.NetCDF4Backend.
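This example defines the SOI as a normalized difference of standardized sea-level pressures at Tahiti and Darwin. Before looking at the full statistic, the arithmetic can be checked with a few lines of plain Python. The pressures, climatological means, standard deviations and normalization below are hypothetical numbers chosen for easy arithmetic, not real climatology, and the helper names `standardize` and `soi_value` are illustrative only.

```python
def standardize(value: float, clim_mean: float, clim_std: float) -> float:
    """Standardized anomaly: (value - climatological mean) / climatological std."""
    return (value - clim_mean) / clim_std


def soi_value(tahiti_slp, darwin_slp, tahiti_clim, darwin_clim, soi_normalization):
    """SOI from sea-level pressures (hPa) and (mean, std) climatology pairs."""
    tahiti_anom = standardize(tahiti_slp, *tahiti_clim)
    darwin_anom = standardize(darwin_slp, *darwin_clim)
    return (tahiti_anom - darwin_anom) / soi_normalization


# Hypothetical inputs: Tahiti is 1 hPa above its mean, Darwin 1 hPa below its mean
index = soi_value(
    tahiti_slp=1012.0,
    darwin_slp=1008.0,
    tahiti_clim=(1011.0, 2.0),  # (mean, std) in hPa
    darwin_clim=(1009.0, 2.0),  # (mean, std) in hPa
    soi_normalization=2.0,
)
```

With these inputs the standardized anomalies are 0.5 and -0.5, so the index is (0.5 - (-0.5)) / 2.0 = 0.5. A positive value means pressure is anomalously high at Tahiti relative to Darwin.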
from collections import OrderedDict
import numpy as np
import torch
from earth2studio.data import GFS
from earth2studio.io import NetCDF4Backend
from earth2studio.models.px import Pangu24
from earth2studio.utils.type import CoordSystem
# Load the default model package which downloads the check point from NGC
package = Pangu24.load_default_package()
model = Pangu24.load_model(package)
# Create the data source
data = GFS()
# Create the IO handler, store the results in a NetCDF4 file
io = NetCDF4Backend(
    file_name="outputs/soi.nc",
    backend_kwargs={"mode": "w"},
)
# Create the custom statistic
class SOI:
    """Custom metric to calculate the Southern Oscillation Index.

    SOI = ( standardized_tahiti_slp - standardized_darwin_slp ) / soi_normalization

    soi_normalization = std( historical ( standardized_tahiti_slp - standardized_darwin_slp ) )

    standardized_*_slp = (*_slp - climatological_mean_*_slp) / climatological_std_*_slp

    Note
    ----
    __str__
        Name that will be applied to the output of this statistic, primarily for IO purposes.
    reduction_dimensions
        Dimensions that this statistic reduces over. This is used to help automatically determine
        the output coordinates, primarily used for IO purposes.
    """

    def __str__(self) -> str:
        return "soi"

    def __init__(
        self,
    ):
        # Read in Tahiti and Darwin SLP data
        from modulus.utils.filesystem import _download_cached

        file_path = _download_cached(
            "http://data.longpaddock.qld.gov.au/SeasonalClimateOutlook/SouthernOscillationIndex/SOIDataFiles/DailySOI1933-1992Base.txt"
        )
        ds = pd.read_csv(file_path, sep=r"\s+")
        dates = pd.date_range("1999-01-01", freq="d", periods=len(ds))
        ds["date"] = dates
        ds = ds.set_index("date")
        ds = ds.drop(["Year", "Day", "SOI"], axis=1)
        # Smooth the daily pressures with a 30 day rolling mean
        ds = ds.rolling(30, min_periods=1).mean().dropna()

        # Monthly climatology: one row per calendar month, one column per station
        self.climatological_means = torch.tensor(
            ds.groupby(ds.index.month).mean().to_numpy(), dtype=torch.float32
        )
        self.climatological_std = torch.tensor(
            ds.groupby(ds.index.month).std().to_numpy(), dtype=torch.float32
        )

        standardized = ds.groupby(ds.index.month).transform(
            lambda x: (x - x.mean()) / x.std()
        )
        diff = standardized["Tahiti"] - standardized["Darwin"]
        self.normalization = torch.tensor(
            diff.groupby(ds.index.month).std().to_numpy(), dtype=torch.float32
        )

        self.tahiti_coords = {
            "variable": np.array(["msl"]),
            "lat": np.array([-17.65]),
            "lon": np.array([210.57]),
        }
        self.darwin_coords = {
            "variable": np.array(["msl"]),
            "lat": np.array([-12.46]),
            "lon": np.array([130.84]),
        }

        self.reduction_dimensions = list(self.tahiti_coords)
    def __call__(
        self, x: torch.Tensor, coords: CoordSystem
    ) -> tuple[torch.Tensor, CoordSystem]:
        """Computes the SOI given an input.

        coords must be a superset of both

            tahiti_coords = {
                'variable': np.array(['msl']),
                'lat': np.array([-17.65]),
                'lon': np.array([210.57])
            }

        and

            darwin_coords = {
                'variable': np.array(['msl']),
                'lat': np.array([-12.46]),
                'lon': np.array([130.84])
            }

        So make sure that the model chosen predicts the `msl` variable.

        Parameters
        ----------
        x : torch.Tensor
            Input tensor
        coords : CoordSystem
            Coordinate system belonging to the input tensor.

        Returns
        -------
        tuple[torch.Tensor, CoordSystem]
            Returns the SOI and appropriate coordinate system.
        """
        tahiti, _ = map_coords(x, coords, self.tahiti_coords)
        darwin, _ = map_coords(x, coords, self.darwin_coords)

        # Drop the singleton variable/lat/lon dimensions and rescale by 1/100
        tahiti = tahiti.squeeze(-3, -2, -1) / 100.0
        darwin = darwin.squeeze(-3, -2, -1) / 100.0

        output_coords = OrderedDict(
            {k: v for k, v in coords.items() if k not in self.reduction_dimensions}
        )

        # Get time coordinates
        times = coords["time"].reshape(-1, 1) + coords["lead_time"].reshape(1, -1)
        # Zero-based month index (0 = January) to match the 12-row climatology tensors
        months = torch.broadcast_to(
            torch.as_tensor(
                [pd.Timestamp(t).month - 1 for t in times.flatten()],
                device=tahiti.device,
                dtype=torch.int32,
            ).reshape(times.shape),
            tahiti.shape,
        )

        cm = self.climatological_means.to(tahiti.device)
        cs = self.climatological_std.to(tahiti.device)
        norm = self.normalization.to(tahiti.device)

        # Column 0 holds the Tahiti climatology, column 1 the Darwin climatology
        tahiti_std_anomaly = (tahiti - cm[months, 0]) / cs[months, 0]
        darwin_std_anomaly = (darwin - cm[months, 1]) / cs[months, 1]

        return (tahiti_std_anomaly - darwin_std_anomaly) / norm[months], output_coords
soi = SOI()
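The workflow only relies on three things from a statistic: its `__str__` name, its `reduction_dimensions`, and a `__call__` that returns the reduced data together with the matching coordinates. The sketch below shows the same shape of interface with a much simpler reduction. `AreaMax` is a hypothetical class written for this illustration, and it operates on NumPy arrays instead of torch tensors to stay dependency-light, so it is not a drop-in Earth2Studio statistic.

```python
from collections import OrderedDict

import numpy as np


class AreaMax:
    """Hypothetical statistic: maximum over the lat and lon dimensions."""

    def __init__(self):
        self.reduction_dimensions = ["lat", "lon"]

    def __str__(self) -> str:
        return "area_max"

    def __call__(self, x, coords):
        # Locate the axes to reduce from the order of the coordinate keys
        axes = tuple(list(coords).index(d) for d in self.reduction_dimensions)
        # Output coordinates are the input coordinates minus the reduced dimensions
        output_coords = OrderedDict(
            (k, v) for k, v in coords.items() if k not in self.reduction_dimensions
        )
        return x.max(axis=axes), output_coords


coords = OrderedDict(
    lead_time=np.array([np.timedelta64(0, "h"), np.timedelta64(24, "h")]),
    lat=np.array([-17.5, -12.5]),
    lon=np.array([130.0, 210.0, 290.0]),
)
# Values 0..11 laid out as (lead_time, lat, lon)
x = np.arange(12, dtype=np.float32).reshape(2, 2, 3)

stat = AreaMax()
y, out_coords = stat(x, coords)
```

Here `y` holds one value per lead time and `out_coords` contains only `lead_time`, which mirrors how the SOI statistic removes `variable`, `lat` and `lon` from its output coordinates.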
Execute the Workflow
With all components initialized, running the workflow is a single line of Python code. The workflow returns the provided IO object back to the user, which can then be used for post-processing. Some IO backends have additional APIs that can be handy for post-processing or saving to file. Check the API docs for more information. We simulate a trajectory of 60 time steps, or roughly 2 months, using Pangu24.
nsteps = 60
nensemble = 1
io = run_stats(["2022-01-01"], nsteps, nensemble, model, soi, data, io)
2025-01-23 04:42:28.643 | INFO | __main__:run_stats:117 - Running simple statistics workflow!
2025-01-23 04:42:28.643 | INFO | __main__:run_stats:120 - Inference device: cuda
2025-01-23 04:42:51.995 | DEBUG | earth2studio.data.gfs:_fetch_gfs_dataarray:209 - Fetching GFS index file: 2022-01-01 00:00:00 lead 0:00:00
Fetching GFS for 2022-01-01 00:00:00: 100%|██████████| 69/69 [00:01<00:00, 37.05it/s]
2025-01-23 04:42:54.135 | SUCCESS | __main__:run_stats:131 - Fetched data from GFS
2025-01-23 04:42:54.137 | INFO | __main__:run_stats:151 - Inference starting!
Running inference: 100%|██████████| 61/61 [02:22<00:00, 2.33s/it]
2025-01-23 04:45:16.187 | SUCCESS | __main__:run_stats:160 - Inference complete
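The length of the run above can be checked with simple date arithmetic. This sketch assumes the model advances 24 hours per step, as the Pangu 24 hour model name suggests, and uses only the standard library.

```python
from datetime import datetime, timedelta

start = datetime(2022, 1, 1)
step = timedelta(hours=24)  # assumed step size of the 24 hour model
nsteps = 60

# The workflow writes the initial state plus one output per forecast step
valid_times = [start + i * step for i in range(nsteps + 1)]

n_outputs = len(valid_times)
last_valid_time = valid_times[-1]
```

The 61 outputs match the total shown by the inference progress bar, and the final valid time is 2022-03-02, two months after the initial condition.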
Post Processing
The last step is to post-process our results.
Notice that the NetCDF IO backend has additional APIs to interact with the stored data.
import matplotlib.pyplot as plt
times = io["time"][:].flatten() + io["lead_time"][:].flatten()
fig = plt.figure(figsize=(12, 4))
ax = fig.add_subplot(1, 1, 1)
ax.plot(times, io["soi"][:].flatten())
ax.set_title("Southern Oscillation Index")
ax.grid(True)
plt.savefig("outputs/southern_oscillation_index_prediction_2022.png")
io.close()
Total running time of the script: (3 minutes 12.410 seconds)