CBottle Data Generation and Infilling#

Climate in a Bottle (cBottle) inference workflows for global weather data synthesis.

This example demonstrates the cBottle diffusion model data source and the infilling diagnostic model for generating global climate and weather data. Both use the same diffusion model, but their sampling procedures differ, which enables two distinct modes of interaction.

For more information on cBottle see:

In this example you will learn:

  • Generating synthetic climate data with the cBottle data source

  • Instantiating the cBottle infill diagnostic model

  • Creating a simple infilling inference workflow

Set Up#

For this example we will use the cBottle data source and infill diagnostic. Unlike other data sources, the CBottle3D data source must be loaded in the same way as a prognostic or diagnostic model.

Thus, we need the following:

import os

os.makedirs("outputs", exist_ok=True)
from dotenv import load_dotenv

load_dotenv()  # TODO: make common example prep function

import torch

from earth2studio.data import WB2ERA5, CBottle3D
from earth2studio.models.dx import CBottleInfill

# Load the default model package, which downloads the checkpoint from NGC
package = CBottle3D.load_default_package()
cbottle_ds = CBottle3D.load_model(package)
# This is an AI data source, so also move it to device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cbottle_ds = cbottle_ds.to(device)

# Create the ground truth data source
era5_ds = WB2ERA5()

Generating Synthetic Weather Data#

Once loaded, generating data from cBottle is as easy as with any other data source. Under the hood, the model is conditioned on the requested timestamp as well as a mid-month SST field. The SST field is handled internally for users, but it limits the range of the data source to years between 1970 and 2022.
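Because of this limited range, it can be useful to check timestamps before querying the data source. The following is a minimal sketch, assuming the bounds are inclusive calendar years; `in_supported_range` is a hypothetical helper and not part of Earth2Studio.

```python
from datetime import datetime

# Assumed inclusive bounds, taken from the "between 1970 and 2022" statement
MIN_YEAR = 1970
MAX_YEAR = 2022


def in_supported_range(times: list[datetime]) -> list[bool]:
    """Flag which timestamps fall inside the assumed supported year range."""
    return [MIN_YEAR <= t.year <= MAX_YEAR for t in times]


requested = [datetime(1969, 12, 31), datetime(2022, 9, 5), datetime(2023, 1, 1)]
print(in_supported_range(requested))
```

Only the middle timestamp, which is the one used in this example, passes the check.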

Note that this diffusion model is stochastic, so querying the same timestamp repeatedly generates different fields, each reflective of the requested time and SST state. Use set_seed for reproducibility.

from datetime import datetime

n_samples = 5
timestamp = datetime(2022, 9, 5)

# Fetch the ground truth
era5_da = era5_ds([timestamp], ["msl", "tcwv"])
# Generate some samples from cBottle
cbottle_da = cbottle_ds([timestamp] * n_samples, ["msl", "tcwv"])

print(era5_da)
print(cbottle_da)
Fetching WB2 data: 100%|██████████| 2/2 [00:00<00:00, 194.22it/s]

Generating cBottle Data: 100%|██████████| 2/2 [00:17<00:00,  8.80s/it]
<xarray.DataArray (time: 1, variable: 2, lat: 721, lon: 1440)> Size: 17MB
array([[[[1.01743422e+05, 1.01743422e+05, 1.01743422e+05, ...,
          1.01743422e+05, 1.01743422e+05, 1.01743422e+05],
         [1.01793445e+05, 1.01793289e+05, 1.01793289e+05, ...,
          1.01794055e+05, 1.01793750e+05, 1.01793750e+05],
         [1.01839055e+05, 1.01838445e+05, 1.01838297e+05, ...,
          1.01839969e+05, 1.01839508e+05, 1.01839203e+05],
         ...,
         [1.00018555e+05, 1.00018555e+05, 1.00018555e+05, ...,
          1.00017945e+05, 1.00017945e+05, 1.00018250e+05],
         [9.98792891e+04, 9.98789844e+04, 9.98789844e+04, ...,
          9.98786797e+04, 9.98789844e+04, 9.98789844e+04],
         [9.98712266e+04, 9.98712266e+04, 9.98712266e+04, ...,
          9.98712266e+04, 9.98712266e+04, 9.98712266e+04]],

        [[1.33166103e+01, 1.33166103e+01, 1.33166103e+01, ...,
          1.33166103e+01, 1.33166103e+01, 1.33166103e+01],
         [1.31231613e+01, 1.31231613e+01, 1.31231613e+01, ...,
          1.31190758e+01, 1.31217995e+01, 1.31217995e+01],
         [1.28479767e+01, 1.28479767e+01, 1.28493385e+01, ...,
          1.28357162e+01, 1.28384399e+01, 1.28425274e+01],
         ...,
         [2.75253296e-01, 2.75253296e-01, 2.75253296e-01, ...,
          2.75253296e-01, 2.75253296e-01, 2.75253296e-01],
         [2.75253296e-01, 2.75253296e-01, 2.75253296e-01, ...,
          2.75253296e-01, 2.75253296e-01, 2.75253296e-01],
         [2.73891449e-01, 2.73891449e-01, 2.73891449e-01, ...,
          2.73891449e-01, 2.73891449e-01, 2.73891449e-01]]]])
Coordinates:
  * time      (time) datetime64[ns] 8B 2022-09-05
  * variable  (variable) <U4 32B 'msl' 'tcwv'
  * lat       (lat) float64 6kB 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
  * lon       (lon) float64 12kB 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
<xarray.DataArray (time: 5, variable: 2, lat: 721, lon: 1440)> Size: 83MB
array([[[[ 9.95229503e+04,  9.95229503e+04,  9.95229503e+04, ...,
           9.95229503e+04,  9.95229503e+04,  9.95229503e+04],
         [ 9.95467359e+04,  9.95467254e+04,  9.95467149e+04, ...,
           9.95467674e+04,  9.95467569e+04,  9.95467464e+04],
         [ 9.95705215e+04,  9.95705005e+04,  9.95704795e+04, ...,
           9.95705845e+04,  9.95705635e+04,  9.95705425e+04],
         ...,
         [ 1.01367648e+05,  1.01366192e+05,  1.01364737e+05, ...,
           1.01372014e+05,  1.01370558e+05,  1.01369103e+05],
         [ 1.01389386e+05,  1.01388659e+05,  1.01387931e+05, ...,
           1.01391569e+05,  1.01390842e+05,  1.01390114e+05],
         [ 1.01411125e+05,  1.01411125e+05,  1.01411125e+05, ...,
           1.01411125e+05,  1.01411125e+05,  1.01411125e+05]],

        [[ 5.19111664e+00,  5.19111664e+00,  5.19111664e+00, ...,
           5.19111664e+00,  5.19111664e+00,  5.19111664e+00],
         [ 5.01230668e+00,  5.00957147e+00,  5.00683627e+00, ...,
           5.02051230e+00,  5.01777709e+00,  5.01504189e+00],
         [ 4.83349673e+00,  4.82802631e+00,  4.82255590e+00, ...,
           4.84990796e+00,  4.84443755e+00,  4.83896714e+00],
...
         [ 1.01638628e+05,  1.01637768e+05,  1.01636907e+05, ...,
           1.01641208e+05,  1.01640348e+05,  1.01639488e+05],
         [ 1.01590151e+05,  1.01589721e+05,  1.01589291e+05, ...,
           1.01591442e+05,  1.01591012e+05,  1.01590581e+05],
         [ 1.01541675e+05,  1.01541675e+05,  1.01541675e+05, ...,
           1.01541675e+05,  1.01541675e+05,  1.01541675e+05]],

        [[ 1.18266818e+01,  1.18266818e+01,  1.18266818e+01, ...,
           1.18266818e+01,  1.18266818e+01,  1.18266818e+01],
         [ 1.16753217e+01,  1.16730877e+01,  1.16708536e+01, ...,
           1.16820240e+01,  1.16797899e+01,  1.16775558e+01],
         [ 1.15239616e+01,  1.15194935e+01,  1.15150254e+01, ...,
           1.15373661e+01,  1.15328979e+01,  1.15284298e+01],
         ...,
         [ 3.19371847e+00,  3.19442046e+00,  3.19512245e+00, ...,
           3.19161250e+00,  3.19231449e+00,  3.19301648e+00],
         [ 3.13864929e+00,  3.13900029e+00,  3.13935129e+00, ...,
           3.13759631e+00,  3.13794730e+00,  3.13829830e+00],
         [ 3.08358012e+00,  3.08358012e+00,  3.08358012e+00, ...,
           3.08358012e+00,  3.08358012e+00,  3.08358012e+00]]]])
Coordinates:
  * time      (time) datetime64[ns] 40B 2022-09-05 2022-09-05 ... 2022-09-05
  * variable  (variable) <U4 32B 'msl' 'tcwv'
  * lat       (lat) float64 6kB 90.0 89.75 89.5 89.25 ... -89.25 -89.5 -89.75
  * lon       (lon) float64 12kB 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
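The sizes reported in the two printouts follow directly from the array shapes. As a quick sanity check, the sketch below reproduces them, assuming each value occupies 8 bytes (an assumption that is consistent with the reported 17MB and 83MB); `dense_nbytes` is a hypothetical helper.

```python
import math

BYTES_PER_VALUE = 8  # assumed 8-byte (float64) values


def dense_nbytes(shape: tuple[int, ...], itemsize: int = BYTES_PER_VALUE) -> int:
    """Number of bytes needed to store a dense array of the given shape."""
    return math.prod(shape) * itemsize


# Shapes are (time, variable, lat, lon), as printed above
era5_bytes = dense_nbytes((1, 2, 721, 1440))
cbottle_bytes = dense_nbytes((5, 2, 721, 1440))
print(f"ERA5: {era5_bytes / 1e6:.0f} MB, cBottle: {cbottle_bytes / 1e6:.0f} MB")
```

The five cBottle samples are stacked along the time dimension, so that array is five times the size of the single ERA5 field.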

Post Processing CBottle Data#

Let’s visualize this data to better understand what the cBottle data source provides. Each sample is unique, yet remains physically realizable. In other words, the cBottle data source can be used to create climate states that did not occur but could have, based on the conditional distribution learned from the training data.

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

variable = "tcwv"

plt.close("all")
projection = ccrs.Orthographic(central_longitude=300.0)

# Create a figure and axes with the specified projection
fig, ax = plt.subplots(2, 3, subplot_kw={"projection": projection}, figsize=(11, 6))
ax = ax.flatten()

ax[0].pcolormesh(
    era5_da.coords["lon"],
    era5_da.coords["lat"],
    era5_da.sel(variable=variable).isel(time=0),
    transform=ccrs.PlateCarree(),
    cmap="cubehelix",
)
ax[0].set_title("ERA5")

for i in range(n_samples):
    ax[i + 1].pcolormesh(
        cbottle_da.coords["lon"],
        cbottle_da.coords["lat"],
        cbottle_da.sel(variable=variable).isel(time=i),
        transform=ccrs.PlateCarree(),
        cmap="cubehelix",
        vmin=0,
        vmax=90,
    )
    ax[i + 1].set_title(f"CBottle Sample {i}")

for ax0 in ax:
    ax0.coastlines()
    ax0.gridlines()

plt.tight_layout()
plt.savefig("outputs/12_tcwv_cbottle_datasource.jpg")
Figure: total column water vapour (tcwv) in six panels: ERA5, CBottle Sample 0, CBottle Sample 1, CBottle Sample 2, CBottle Sample 3, CBottle Sample 4.
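Beyond visual inspection, the uniqueness of the samples could also be quantified, for example with the root-mean-square difference between each pair of sample fields. The sketch below uses short made-up lists as stand-ins for flattened fields; it is not model output, and `rmse` is a hypothetical helper.

```python
import math
from itertools import combinations


def rmse(a: list[float], b: list[float]) -> float:
    """Root-mean-square difference between two equally sized fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Toy stand-ins for flattened sample fields (made-up values)
samples = {
    "sample_0": [10.0, 20.0, 30.0, 40.0],
    "sample_1": [12.0, 18.0, 33.0, 41.0],
    "sample_2": [10.0, 20.0, 30.0, 40.0],  # deliberate duplicate of sample_0
}

# Compare every pair of samples once
for (name_a, field_a), (name_b, field_b) in combinations(samples.items(), 2):
    print(name_a, name_b, round(rmse(field_a, field_b), 3))
```

A value of zero would indicate two identical samples, which is what one would expect only when the same seed is reused.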

Variable Infilling with CBottleInfill Diagnostic#

Next, let's look at using the same model for variable infilling. CBottleInfill lets users generate global weather fields like the data source does, but conditioned on a configurable set of input fields. This makes the diagnostic extremely flexible: it can be used with all types of data sources and models.

To demonstrate this, let's consider two instances of the infilling diagnostic with different sets of inputs and then compare the resulting infilled variables. Note that the outputs of both configurations have the same size and contain the same variables.

import numpy as np

from earth2studio.data.utils import fetch_data

# Input variables
input_variables = ["u10m", "v10m"]

# Load the default model package, which downloads the checkpoint from NGC
package = CBottleInfill.load_default_package()
model = CBottleInfill.load_model(package, input_variables=input_variables)
model = model.to(device)

model.set_seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed(0)

times = np.array([timestamp] * n_samples, dtype="datetime64[ns]")
x, coords = fetch_data(era5_ds, times, input_variables, device=device)
output_0, output_coords = model(x, coords)
print(output_0.shape)
Fetching WB2 data: 100%|██████████| 10/10 [00:00<00:00, 401.10it/s]
torch.Size([5, 1, 45, 721, 1440])
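The numbers in this output can be unpacked with some simple arithmetic. The sketch below assumes the output dimensions are ordered (time, lead_time, variable, lat, lon), which is inferred from how the tensors are indexed later in this example, and assumes 4-byte float32 elements for the memory estimate.

```python
import math

# Values read from the example above
n_samples = 5
input_variables = ["u10m", "v10m"]
output_shape = (5, 1, 45, 721, 1440)

# Assumed dimension order for the output tensor
dims = ("time", "lead_time", "variable", "lat", "lon")
labelled = dict(zip(dims, output_shape))

# One fetch per (timestamp, variable) pair explains the progress bar total
n_fetches = n_samples * len(input_variables)

# Memory footprint, assuming float32 (4-byte) tensor elements
n_bytes = math.prod(output_shape) * 4

print(labelled)
print(f"fetches: {n_fetches}, output size: {n_bytes / 1e9:.2f} GB")
```

Under these assumptions, two input variables are expanded into 45 output variables for each of the five samples.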

Now repeat the process above with an expanded set of input variables. In this instance we give the model much more data to condition on.

input_variables = [
    "u10m",
    "v10m",
    "t2m",
    "msl",
    "z50",
    "u50",
    "v50",
    "z500",
    "u500",
    "v500",
    "z1000",
    "u1000",
    "v1000",
]

# Load the default model package, which downloads the checkpoint from NGC
package = CBottleInfill.load_default_package()
model = CBottleInfill.load_model(package, input_variables=input_variables)
model = model.to(device)

model.set_seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed(0)

x, coords = fetch_data(era5_ds, times, input_variables, device=device)
output_1, output_coords = model(x, coords)
print(output_1.shape)
Fetching WB2 data: 100%|██████████| 65/65 [00:01<00:00, 48.36it/s]
torch.Size([5, 1, 45, 721, 1440])
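To build some intuition for why conditioning on more inputs can narrow the spread of the generated fields, consider a toy linear model. This is only an analogy and not how cBottle works internally: the target is a weighted sum of independent, unit-variance inputs plus noise, and the weights and noise level are made-up values.

```python
def residual_variance(weights: list[float], noise_var: float, n_observed: int) -> float:
    """Variance of y = sum(w_i * x_i) + noise after observing the first n_observed inputs.

    Assumes the x_i are independent with unit variance, so every unobserved
    input contributes w_i**2 and every observed input contributes nothing.
    """
    return sum(w * w for w in weights[n_observed:]) + noise_var


weights = [1.0, 0.5, 0.5]  # hypothetical sensitivities of the target to each input
noise_var = 0.1  # hypothetical irreducible variance

for k in range(len(weights) + 1):
    var = residual_variance(weights, noise_var, k)
    print(f"{k} inputs observed -> residual variance {var:.2f}")
```

In this toy setting each additional observed input can only lower the remaining variance, until just the irreducible noise is left.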

Post Processing CBottleInfill#

To post process the results, we look at an infilled variable, total column water vapour. Compared to the samples from CBottle3D, the results are much more closely aligned with the ground truth, since the infill model samples a conditional distribution. Additionally, the model given more input variables is better aligned with the ground truth due to the additional information provided.

variable = "tcwv"
var_idx = np.where(output_coords["variable"] == variable)[0][0]
era5_data, _ = fetch_data(era5_ds, times[:1], [variable], device=device)

plt.close("all")
projection = ccrs.Mollweide(central_longitude=0)

# Create a figure and axes with the specified projection
fig, ax = plt.subplots(2, 3, subplot_kw={"projection": projection}, figsize=(10, 6))


def plot_contour(
    ax0: plt.Axes,
    data: torch.Tensor,
    cmap: str = "jet",
    vrange: tuple[float, float] = (0, 90),
) -> None:
    """Draw filled contours of a lat/lon field on the given axes"""
    ax0.contourf(
        output_coords["lon"],
        output_coords["lat"],
        data.cpu(),
        vmin=vrange[0],
        vmax=vrange[1],
        transform=ccrs.PlateCarree(),
        levels=20,
        cmap=cmap,
    )
    ax0.coastlines()
    ax0.gridlines()


plot_contour(ax[0, 0], era5_data[0, 0, 0])
plot_contour(ax[0, 1], torch.mean(output_0[:, 0, var_idx], axis=0))
plot_contour(ax[0, 2], torch.mean(output_1[:, 0, var_idx], axis=0))
plot_contour(
    ax[1, 1], torch.std(output_0[:, 0, var_idx], axis=0), cmap="inferno", vrange=(0, 10)
)
plot_contour(
    ax[1, 2], torch.std(output_1[:, 0, var_idx], axis=0), cmap="inferno", vrange=(0, 10)
)

ax[0, 0].set_title("ERA5")
ax[0, 1].set_title("2 Input Variables Mean")
ax[0, 2].set_title("13 Input Variables Mean")
ax[1, 1].set_title("2 Input Variables Std")
ax[1, 2].set_title("13 Input Variables Std")

plt.tight_layout()
plt.savefig("outputs/12_tcwv_cbottle_infill.jpg")
Figure: total column water vapour (tcwv) for ERA5, together with the sample mean and standard deviation of the infilled field for the 2-variable and 13-variable configurations.
Fetching WB2 data: 100%|██████████| 1/1 [00:00<00:00, 268.81it/s]
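The mean and standard deviation panels reduce the five samples along the sample axis, independently at every grid point. The sketch below shows that reduction for a single grid point using made-up values; these numbers are illustrative and are not taken from the model output.

```python
import statistics

# Made-up tcwv values (kg/m^2) from five samples at one grid point
samples = [31.0, 29.5, 30.5, 32.0, 29.0]

mean = statistics.fmean(samples)
# statistics.stdev applies Bessel's correction (divides by n - 1), which
# matches the default behaviour of torch.std
std = statistics.stdev(samples)

print(f"mean={mean:.2f}, std={std:.2f}")
```

With only five samples the standard deviation is a rough estimate, so small differences between the two configurations should be read with care.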

Total running time of the script: (2 minutes 43.852 seconds)