IO Backend Performance

Leverage different IO backends for storing inference results.

This example explores the IO backends in Earth2Studio and how they can be used to write data to different formats and locations. IO is a core part of any inference pipeline and, depending on the target, can dramatically affect performance. This example walks through the different IO backend APIs in a simple workflow.

This example covers:

  • Initializing, creating arrays and writing with the Zarr IO backend

  • Initializing, creating arrays and writing with the NetCDF IO backend

  • Initializing and writing with the Asynchronous Non-blocking Zarr IO backend

  • Discussing performance implications and strategies that can be used

Set Up

To demonstrate the different IO backends, this example uses a simple ensemble workflow that we create manually. The built-in workflow in Earth2Studio could be used instead, but writing it by hand makes the IO APIs easier to understand.

We need the following components:

import os

os.makedirs("outputs", exist_ok=True)
from dotenv import load_dotenv

load_dotenv()  # TODO: make common example prep function

import torch

from earth2studio.data import GFS, DataSource, fetch_data
from earth2studio.io import AsyncZarrBackend, IOBackend, NetCDF4Backend, ZarrBackend
from earth2studio.models.px import DLWP, PrognosticModel
from earth2studio.perturbation import Gaussian, Perturbation

# Get the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the default DLWP prognostic model
package = DLWP.load_default_package()
model = DLWP.load_model(package)
model = model.to(device)

# Create the GFS data source
ds = GFS()

# Create perturbation method
pt = Gaussian()

Creating a Simple Ensemble Workflow

Start by creating a simple ensemble inference workflow. This is essentially a simpler version of the built-in ensemble workflow earth2studio.run.ensemble(). Here the workflow produces a 5-day forecast initialized on December 20, 2022, so that the final lead time falls on Christmas 2022. Following standard Earth2Studio practice, the function accepts an initialized prognostic model, data source, IO backend and perturbation method.

import os
import time
from datetime import datetime, timedelta

import numpy as np
from tqdm import tqdm

from earth2studio.utils.coords import map_coords, split_coords
from earth2studio.utils.time import to_time_array

times = [datetime(2022, 12, 20)]
nsteps = 20  # Assuming 6-hour time steps


def christmas_five_day_ensemble(
    times: list[datetime],
    nsteps: int,
    prognostic: PrognosticModel,
    data: DataSource,
    io: IOBackend,
    perturbation: Perturbation,
    nensemble: int = 8,
    device: str = "cuda",
) -> None:
    """Ensemble inference example"""
    # ==========================================
    # Fetch Initialization Data
    prognostic_ic = prognostic.input_coords()
    times = to_time_array(times)

    x, coords0 = fetch_data(
        source=data,
        time=times,
        variable=prognostic_ic["variable"],
        lead_time=prognostic_ic["lead_time"],
        device=device,
    )
    # ==========================================
    # ==========================================
    # Set up IO backend by pre-allocating arrays (not needed for AsyncZarrBackend)
    total_coords = prognostic.output_coords(prognostic.input_coords()).copy()
    if "batch" in total_coords:
        del total_coords["batch"]
    total_coords["time"] = times
    total_coords["lead_time"] = np.asarray(
        [
            prognostic.output_coords(prognostic.input_coords())["lead_time"] * i
            for i in range(nsteps + 1)
        ]
    ).flatten()
    total_coords.move_to_end("lead_time", last=False)
    total_coords.move_to_end("time", last=False)
    total_coords = {"ensemble": np.arange(nensemble)} | total_coords

    variables_to_save = total_coords.pop("variable")
    io.add_array(total_coords, variables_to_save)
    # ==========================================
    # ==========================================
    # Run inference
    coords = {"ensemble": np.arange(nensemble)} | coords0.copy()
    x = x.unsqueeze(0).repeat(nensemble, *([1] * x.ndim))

    # Map lat and lon if needed
    x, coords = map_coords(x, coords, prognostic_ic)

    # Perturb ensemble
    x, coords = perturbation(x, coords)

    # Create prognostic iterator
    model = prognostic.create_iterator(x, coords)

    with tqdm(
        total=nsteps + 1,
        desc="Running batch inference",
        position=1,
        leave=False,
    ) as pbar:
        for step, (x, coords) in enumerate(model):
            # Dump result to IO, split_coords separates variables to different arrays
            x, coords = map_coords(x, coords, {"variable": np.array(["t2m", "tcwv"])})
            io.write(*split_coords(x, coords))
            pbar.update(1)
            if step == nsteps:
                break
    # ==========================================


def get_folder_size(folder_path: str) -> float:
    """Get file or folder size in megabytes"""
    if os.path.isfile(folder_path):
        return os.path.getsize(folder_path) / (1024 * 1024)

    total_size = 0
    for dirpath, dirnames, filenames in os.walk(folder_path):
        for filename in filenames:
            file_path = os.path.join(dirpath, filename)
            total_size += os.path.getsize(file_path)
    return total_size / (1024 * 1024)
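
The coordinate bookkeeping inside the workflow can be checked in isolation. The following is a minimal sketch, assuming a single 6-hour output lead time and a small hypothetical 4 x 8 lat/lon grid in place of the real model coordinates. It reproduces only the ordering logic with NumPy and does not call any Earth2Studio API.

```python
from collections import OrderedDict

import numpy as np

nsteps = 20
nensemble = 8

# Assumption: the model emits one 6-hour lead time per step
step = np.array([np.timedelta64(6, "h")])

# Same construction as in the workflow: step * 0, step * 1, ..., step * nsteps
lead_time = np.asarray([step * i for i in range(nsteps + 1)]).flatten()

# Hypothetical stand-in for prognostic.output_coords(...)
total_coords = OrderedDict(
    variable=np.array(["t2m", "tcwv"]),
    lat=np.linspace(90, -90, 4),
    lon=np.linspace(0, 360, 8, endpoint=False),
)
total_coords["time"] = np.array([np.datetime64("2022-12-20T00:00")])
total_coords["lead_time"] = lead_time

# Move lead_time, then time, to the front and prepend the ensemble dimension
total_coords.move_to_end("lead_time", last=False)
total_coords.move_to_end("time", last=False)
total_coords = {"ensemble": np.arange(nensemble)} | total_coords

# Variables become separate arrays, so they are removed from the dimensions
variables_to_save = total_coords.pop("variable")
array_shape = tuple(len(values) for values in total_coords.values())

print(list(total_coords))
print(array_shape)
print(lead_time[-1])
```

With 20 steps of 6 hours, the last lead time is 120 hours, which is where the "five day" in the function name comes from.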

Local Storage Zarr IO

As a baseline, let's run the Zarr IO backend and save the output to local disk. Local storage is typically preferred since the data can be accessed with standard libraries after the inference pipeline has finished. Chunking plays an important role in performance, both for compression and for later data access. Here we chunk the output data along time and lead_time.

io = ZarrBackend(
    "outputs/17_io_sync.zarr",
    chunks={"time": 1, "lead_time": 1},
    backend_kwargs={"overwrite": True},
)

start_time = time.time()
christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
zarr_local_clock = time.time() - start_time
2025-07-09 03:51:44.433 | WARNING  | earth2studio.io.zarr:add_array:200 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch
2025-07-09 03:51:44.437 | WARNING  | earth2studio.io.zarr:add_array:206 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


print(f"\nLocal zarr store inference time: {zarr_local_clock}s")
print(
    f"Uncompressed zarr store size: {get_folder_size('outputs/17_io_sync.zarr'):.2f} MB"
)
Local zarr store inference time: 16.309239625930786s
Uncompressed zarr store size: 1330.78 MB
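
The reported store size can be sanity-checked with a little arithmetic. This is a rough sketch under stated assumptions: a 721 x 1440 (0.25 degree) output grid, float32 values, two saved variables (t2m and tcwv), and dimensions not listed in chunks being kept whole within each chunk. None of these values are read from the backend; they are assumptions for illustration.

```python
# Assumed dimensions of each saved array (not queried from Earth2Studio)
nensemble, ntime, nlead = 8, 1, 21
nlat, nlon = 721, 1440
nvariables = 2  # t2m and tcwv
bytes_per_value = 4  # float32

# With chunks={"time": 1, "lead_time": 1}, one chunk spans a single
# (time, lead_time) pair and all ensemble members, latitudes and longitudes
chunk_bytes = nensemble * nlat * nlon * bytes_per_value
chunk_mb = chunk_bytes / (1024 * 1024)

# One chunk per (time, lead_time) pair for every variable
nchunks = ntime * nlead * nvariables
total_mb = nchunks * chunk_bytes / (1024 * 1024)

print(f"Chunk size: {chunk_mb:.2f} MB")
print(f"Number of chunks: {nchunks}")
print(f"Estimated uncompressed size: {total_mb:.2f} MB")
```

Under these assumptions the estimate comes to about 1330.75 MB, close to the measured store size; the small remainder would be coordinate arrays and metadata.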

Compressed Local Storage Zarr IO

By default, the Zarr IO backends write uncompressed data. This is often fine when data volumes are low. However, when writing a very large amount of data, or when the data must be sent over the network to a remote store, compression is essential. With the standard Zarr backend this causes a noticeable slowdown (about 23.7 s versus 16.3 s in the runs recorded here), but the output store is roughly 3x smaller.

import zarr

io = ZarrBackend(
    "outputs/17_io_sync_compressed.zarr",
    chunks={"time": 1, "lead_time": 1},
    backend_kwargs={"overwrite": True},
    zarr_codecs=zarr.codecs.BloscCodec(
        cname="zstd", clevel=3, shuffle=zarr.codecs.BloscShuffle.shuffle
    ),  # Zarrs default
)

start_time = time.time()
christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
zarr_local_clock = time.time() - start_time
print(f"\nLocal compressed zarr store inference time: {zarr_local_clock}s")
print(
    f"Compressed zarr store size: {get_folder_size('outputs/17_io_sync_compressed.zarr'):.2f} MB"
)
Local compressed zarr store inference time: 23.72968864440918s
Compressed zarr store size: 413.08 MB
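
The shuffle option passed to the codec above regroups bytes before compression, which tends to help for smooth floating point fields. The sketch below illustrates the idea with NumPy and the standard library only. It uses zlib as a stand-in compressor rather than Blosc/zstd, a synthetic temperature-like field, and two hypothetical helper functions (byte_shuffle, byte_unshuffle), so the numbers it prints are not comparable to the Zarr results above.

```python
import zlib

import numpy as np


def byte_shuffle(arr: np.ndarray) -> bytes:
    """Regroup bytes so the i-th byte of every element is stored contiguously."""
    per_element = np.ascontiguousarray(arr).view(np.uint8).reshape(-1, arr.itemsize)
    return per_element.T.tobytes()


def byte_unshuffle(buf: bytes, dtype: str, shape: tuple) -> np.ndarray:
    """Invert byte_shuffle and restore the original array."""
    itemsize = np.dtype(dtype).itemsize
    planes = np.frombuffer(buf, dtype=np.uint8).reshape(itemsize, -1)
    return np.ascontiguousarray(planes.T).view(dtype).reshape(shape)


# Synthetic smooth field on a 1 degree grid (illustrative only)
lat = np.deg2rad(np.linspace(-90, 90, 181))
lon = np.deg2rad(np.linspace(0, 360, 360, endpoint=False))
field = (280 + 10 * np.cos(lat)[:, None] * np.sin(lon)[None, :]).astype(np.float32)

plain = zlib.compress(field.tobytes(), 3)
shuffled = zlib.compress(byte_shuffle(field), 3)

# Round trip: decompress, unshuffle and compare with the original field
restored = byte_unshuffle(zlib.decompress(shuffled), "float32", field.shape)

print(f"Raw bytes:            {field.nbytes}")
print(f"Compressed:           {len(plain)}")
print(f"Shuffle + compressed: {len(shuffled)}")
print(f"Lossless round trip:  {np.array_equal(restored, field)}")
```

The extra shuffle and compression work happens on every write, which is consistent with the slower inference loop observed for the compressed store.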

Local Storage NetCDF IO

NetCDF offers a similar user experience but saves the output to a single netCDF file. For local storage, NetCDF is typically preferred since it keeps all outputs in a single file.

io = NetCDF4Backend("outputs/17_io_sync.nc", backend_kwargs={"mode": "w"})
start_time = time.time()
christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
nc_local_clock = time.time() - start_time
print(f"\nLocal netcdf store inference time: {nc_local_clock}s")
print(
    f"Uncompressed netcdf file size: {get_folder_size('outputs/17_io_sync.nc'):.2f} MB"
)
Local netcdf store inference time: 1.9192547798156738s
Uncompressed netcdf file size: 1330.79 MB

In Memory Zarr IO

One way to speed up IO is to save outputs to an in-memory store. In-memory stores are more limited in size, depending on the hardware being used. They also require care: once the Python object is deleted, the data is gone.

io = ZarrBackend(
    chunks={"time": 1, "lead_time": 1}, backend_kwargs={"overwrite": True}
)  # No path = in-memory store for Zarr
start_time = time.time()
christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
zarr_memory_clock = time.time() - start_time
print(f"\nIn memory zarr store inference time: {zarr_memory_clock}s")
In memory zarr store inference time: 12.381650686264038s

Compressed Local Async Zarr IO

The async Zarr IO backend is an advanced IO backend designed to offer asynchronous Zarr 3.0 writes to in-memory, local and remote data stores. This IO backend is ideal when large volumes of data need to be written and users want to mask the IO behind the forward execution of the model.
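
The idea of masking IO behind model execution can be sketched generically with a background worker thread. The example below is not how AsyncZarrBackend is implemented; fake_model_step and fake_write are hypothetical stand-ins that simply sleep, and the store is a plain dictionary. It only shows that handing each write to a worker lets the next model step start before the previous write has finished.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_model_step(step: int) -> int:
    """Hypothetical stand-in for one forward pass of the model."""
    time.sleep(0.01)
    return step * 2


def fake_write(store: dict, step: int, value: int) -> None:
    """Hypothetical stand-in for a slow write to disk or a remote store."""
    time.sleep(0.01)
    store[step] = value


def run_blocking(nsteps: int) -> dict:
    """Each write must finish before the next model step starts."""
    store: dict = {}
    for step in range(nsteps):
        fake_write(store, step, fake_model_step(step))
    return store


def run_overlapped(nsteps: int) -> dict:
    """Writes run on a worker thread while the main thread keeps computing."""
    store: dict = {}
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = []
        for step in range(nsteps):
            value = fake_model_step(step)  # computed on the main thread
            pending.append(pool.submit(fake_write, store, step, value))
        for future in pending:
            future.result()  # surface any write error before returning
    return store


for runner in (run_blocking, run_overlapped):
    start = time.perf_counter()
    result = runner(10)
    elapsed = time.perf_counter() - start
    print(f"{runner.__name__}: {len(result)} writes in {elapsed:.3f}s")
```

Both variants produce the same stored values; the overlapped variant is expected to finish sooner only because the simulated write and the simulated compute can proceed at the same time.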

Because this IO backend relies on both async and multi-threading, it has a different initialization pattern than the others. The main difference is that this backend does not use the add_array API. Instead, users specify parallel_coords in the constructor, which denote the coordinates along which slices will be written during inference. Typically these might be time, lead_time and ensemble.

parallel_coords = {
    "time": np.asarray(times),
    "lead_time": np.asarray([timedelta(hours=6 * i) for i in range(nsteps + 1)]),
}
io = AsyncZarrBackend(
    "outputs/17_io_async.zarr",
    parallel_coords=parallel_coords,
    zarr_codecs=zarr.codecs.BloscCodec(
        cname="zstd", clevel=3, shuffle=zarr.codecs.BloscShuffle.shuffle
    ),
)
start_time = time.time()
christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
zarr_async_clock = time.time() - start_time
2025-07-09 03:52:36.461 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch
2025-07-09 03:52:36.461 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch
2025-07-09 03:52:36.462 | DEBUG    | earth2studio.io.async_zarr:__init__:154 - Setting up Zarr object pool of size 1, may take a bit

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.484 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 295463736-857297

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.508 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 399402626-989950

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.530 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 209450878-718645

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.553 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 330483332-833835

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.576 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 251866913-804464

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.597 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 422682974-1171397

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.620 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 410101849-875378

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
Fetching GFS data:  14%|█▍        | 1/7 [00:00<00:00,  6.33it/s]
Fetching GFS data: 100%|██████████| 7/7 [00:00<00:00, 44.22it/s]

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.650 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 410972364-870639

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.672 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 253345828-804505

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.694 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 423575610-1170377

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.717 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 331696727-829753

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.739 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 210327553-720496

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.761 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 400334303-985030

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:36.783 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 296814817-852960

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
Fetching GFS data:  14%|█▍        | 1/7 [00:00<00:00,  6.41it/s]
Fetching GFS data: 100%|██████████| 7/7 [00:00<00:00, 44.80it/s]


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.837 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.837 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.838 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array ensemble to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.841 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array time to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.844 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array lead_time to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.847 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array lat to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.850 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array lon to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.853 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:332 - Initializing array t2m with shape (8, 1, 21, 721, 1440) with chunks (8, 1, 1, 721, 1440) dtype <class 'numpy.float32'>


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.854 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:332 - Initializing array tcwv with shape (8, 1, 21, 721, 1440) with chunks (8, 1, 1, 721, 1440) dtype <class 'numpy.float32'>


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:36.897 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]

Running batch inference:   5%|▍         | 1/21 [00:00<00:08,  2.28it/s]


2025-07-09 03:52:37.305 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   5%|▍         | 1/21 [00:00<00:08,  2.28it/s]


2025-07-09 03:52:37.305 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   5%|▍         | 1/21 [00:00<00:08,  2.28it/s]


2025-07-09 03:52:37.355 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:   5%|▍         | 1/21 [00:00<00:08,  2.28it/s]

Running batch inference:  10%|▉         | 2/21 [00:00<00:08,  2.35it/s]


2025-07-09 03:52:37.698 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  10%|▉         | 2/21 [00:00<00:08,  2.35it/s]


2025-07-09 03:52:37.699 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  10%|▉         | 2/21 [00:00<00:08,  2.35it/s]


2025-07-09 03:52:37.776 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  10%|▉         | 2/21 [00:00<00:08,  2.35it/s]

Running batch inference:  14%|█▍        | 3/21 [00:01<00:07,  2.44it/s]


2025-07-09 03:52:38.108 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  14%|█▍        | 3/21 [00:01<00:07,  2.44it/s]


2025-07-09 03:52:38.108 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  14%|█▍        | 3/21 [00:01<00:07,  2.44it/s]


2025-07-09 03:52:38.183 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  14%|█▍        | 3/21 [00:01<00:07,  2.44it/s]

Running batch inference:  19%|█▉        | 4/21 [00:01<00:06,  2.44it/s]


2025-07-09 03:52:38.497 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  19%|█▉        | 4/21 [00:01<00:06,  2.44it/s]


2025-07-09 03:52:38.498 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  19%|█▉        | 4/21 [00:01<00:06,  2.44it/s]


2025-07-09 03:52:38.573 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  19%|█▉        | 4/21 [00:01<00:06,  2.44it/s]

Running batch inference:  24%|██▍       | 5/21 [00:02<00:06,  2.51it/s]


2025-07-09 03:52:38.893 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  24%|██▍       | 5/21 [00:02<00:06,  2.51it/s]


2025-07-09 03:52:38.893 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  24%|██▍       | 5/21 [00:02<00:06,  2.51it/s]


2025-07-09 03:52:38.945 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  24%|██▍       | 5/21 [00:02<00:06,  2.51it/s]

Running batch inference:  29%|██▊       | 6/21 [00:02<00:05,  2.54it/s]


2025-07-09 03:52:39.258 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  29%|██▊       | 6/21 [00:02<00:05,  2.54it/s]


2025-07-09 03:52:39.259 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  29%|██▊       | 6/21 [00:02<00:05,  2.54it/s]


2025-07-09 03:52:39.310 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  29%|██▊       | 6/21 [00:02<00:05,  2.54it/s]

Running batch inference:  33%|███▎      | 7/21 [00:02<00:05,  2.61it/s]


2025-07-09 03:52:39.641 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  33%|███▎      | 7/21 [00:02<00:05,  2.61it/s]


2025-07-09 03:52:39.641 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  33%|███▎      | 7/21 [00:02<00:05,  2.61it/s]


2025-07-09 03:52:39.694 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  33%|███▎      | 7/21 [00:02<00:05,  2.61it/s]

Running batch inference:  38%|███▊      | 8/21 [00:03<00:04,  2.62it/s]


2025-07-09 03:52:40.001 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  38%|███▊      | 8/21 [00:03<00:04,  2.62it/s]


2025-07-09 03:52:40.001 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  38%|███▊      | 8/21 [00:03<00:04,  2.62it/s]


2025-07-09 03:52:40.052 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  38%|███▊      | 8/21 [00:03<00:04,  2.62it/s]

Running batch inference:  43%|████▎     | 9/21 [00:03<00:04,  2.63it/s]


2025-07-09 03:52:40.396 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  43%|████▎     | 9/21 [00:03<00:04,  2.63it/s]


2025-07-09 03:52:40.396 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  43%|████▎     | 9/21 [00:03<00:04,  2.63it/s]


2025-07-09 03:52:40.449 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  43%|████▎     | 9/21 [00:03<00:04,  2.63it/s]

Running batch inference:  48%|████▊     | 10/21 [00:03<00:04,  2.64it/s]


2025-07-09 03:52:40.751 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  48%|████▊     | 10/21 [00:03<00:04,  2.64it/s]


2025-07-09 03:52:40.751 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  48%|████▊     | 10/21 [00:03<00:04,  2.64it/s]


2025-07-09 03:52:40.801 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  48%|████▊     | 10/21 [00:03<00:04,  2.64it/s]

Running batch inference:  52%|█████▏    | 11/21 [00:04<00:03,  2.70it/s]


2025-07-09 03:52:41.121 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  52%|█████▏    | 11/21 [00:04<00:03,  2.70it/s]


2025-07-09 03:52:41.121 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  52%|█████▏    | 11/21 [00:04<00:03,  2.70it/s]


2025-07-09 03:52:41.177 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  52%|█████▏    | 11/21 [00:04<00:03,  2.70it/s]

Running batch inference:  57%|█████▋    | 12/21 [00:04<00:03,  2.63it/s]


2025-07-09 03:52:41.504 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  57%|█████▋    | 12/21 [00:04<00:03,  2.63it/s]


2025-07-09 03:52:41.504 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  57%|█████▋    | 12/21 [00:04<00:03,  2.63it/s]


2025-07-09 03:52:41.553 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  57%|█████▋    | 12/21 [00:04<00:03,  2.63it/s]

Running batch inference:  62%|██████▏   | 13/21 [00:05<00:03,  2.66it/s]


2025-07-09 03:52:41.889 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  62%|██████▏   | 13/21 [00:05<00:03,  2.66it/s]


2025-07-09 03:52:41.890 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  62%|██████▏   | 13/21 [00:05<00:03,  2.66it/s]


2025-07-09 03:52:41.943 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  62%|██████▏   | 13/21 [00:05<00:03,  2.66it/s]

Running batch inference:  67%|██████▋   | 14/21 [00:05<00:02,  2.63it/s]


2025-07-09 03:52:42.261 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  67%|██████▋   | 14/21 [00:05<00:02,  2.63it/s]


2025-07-09 03:52:42.261 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  67%|██████▋   | 14/21 [00:05<00:02,  2.63it/s]


2025-07-09 03:52:42.314 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  67%|██████▋   | 14/21 [00:05<00:02,  2.63it/s]

Running batch inference:  71%|███████▏  | 15/21 [00:05<00:02,  2.67it/s]


2025-07-09 03:52:42.663 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  71%|███████▏  | 15/21 [00:05<00:02,  2.67it/s]


2025-07-09 03:52:42.664 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  71%|███████▏  | 15/21 [00:05<00:02,  2.67it/s]


2025-07-09 03:52:42.722 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  71%|███████▏  | 15/21 [00:05<00:02,  2.67it/s]

Running batch inference:  76%|███████▌  | 16/21 [00:06<00:01,  2.56it/s]


2025-07-09 03:52:43.051 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  76%|███████▌  | 16/21 [00:06<00:01,  2.56it/s]


2025-07-09 03:52:43.051 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  76%|███████▌  | 16/21 [00:06<00:01,  2.56it/s]


2025-07-09 03:52:43.111 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  76%|███████▌  | 16/21 [00:06<00:01,  2.56it/s]

Running batch inference:  81%|████████  | 17/21 [00:06<00:01,  2.59it/s]


2025-07-09 03:52:43.465 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  81%|████████  | 17/21 [00:06<00:01,  2.59it/s]


2025-07-09 03:52:43.466 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  81%|████████  | 17/21 [00:06<00:01,  2.59it/s]


2025-07-09 03:52:43.523 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  81%|████████  | 17/21 [00:06<00:01,  2.59it/s]

Running batch inference:  86%|████████▌ | 18/21 [00:07<00:01,  2.51it/s]


2025-07-09 03:52:43.850 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  86%|████████▌ | 18/21 [00:07<00:01,  2.51it/s]


2025-07-09 03:52:43.850 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  86%|████████▌ | 18/21 [00:07<00:01,  2.51it/s]


2025-07-09 03:52:43.910 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  86%|████████▌ | 18/21 [00:07<00:01,  2.51it/s]

Running batch inference:  90%|█████████ | 19/21 [00:07<00:00,  2.56it/s]


2025-07-09 03:52:44.267 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  90%|█████████ | 19/21 [00:07<00:00,  2.56it/s]


2025-07-09 03:52:44.267 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  90%|█████████ | 19/21 [00:07<00:00,  2.56it/s]


2025-07-09 03:52:44.326 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  90%|█████████ | 19/21 [00:07<00:00,  2.56it/s]

Running batch inference:  95%|█████████▌| 20/21 [00:07<00:00,  2.49it/s]


2025-07-09 03:52:44.652 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  95%|█████████▌| 20/21 [00:07<00:00,  2.49it/s]


2025-07-09 03:52:44.653 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  95%|█████████▌| 20/21 [00:07<00:00,  2.49it/s]


2025-07-09 03:52:44.712 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  95%|█████████▌| 20/21 [00:07<00:00,  2.49it/s]

Running batch inference: 100%|██████████| 21/21 [00:08<00:00,  2.54it/s]
print(f"\nAsync zarr store inference time: {zarr_async_clock}s")
print(
    f"Compressed async zarr store size: {get_folder_size('outputs/17_io_async.zarr'):.2f} MB"
)
Async zarr store inference time: 8.538628339767456s
Compressed async zarr store size: 413.07 MB
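The warnings in the log above state that datetime64 and timedelta64 coordinates are converted to int64 nanoseconds since epoch before being written. The following is a minimal sketch, using only NumPy, of how such integer coordinates could be decoded again after reading the store. The coordinate values here are illustrative stand-ins and are not read from the actual Zarr store.

```python
import numpy as np

# Illustrative coordinates, similar in spirit to the ones used in this example
time_coord = np.array(["2022-12-20T00:00:00"], dtype="datetime64[ns]")
lead_coord = (np.arange(3) * np.timedelta64(6, "h")).astype("timedelta64[ns]")

# Integer representation: nanoseconds since epoch (or nanosecond durations)
time_int = time_coord.astype("int64")
lead_int = lead_coord.astype("int64")

# Decoding back to NumPy time types after loading the integers
time_decoded = time_int.astype("datetime64[ns]")
lead_decoded = lead_int.astype("timedelta64[ns]")

print(time_decoded, lead_decoded)
```

Casting through `int64` is lossless for nanosecond precision, so the round trip recovers the original coordinates.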

Compressed Local Non-Blocking Async Zarr IO#

That was faster than the normal Zarr method, even the uncompressed version, making it comparable to NetCDF, but we can still improve with this IO backend. A unique feature of this particular backend is its non-blocking mode, in which IO writes are placed onto other threads. Users need to be careful with this mode in two ways: ensure data is not mutated while the IO backend is still moving it off the GPU, and wait for the write threads to finish before the object is deleted.
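The two cautions for non-blocking writes can be illustrated with a small generic sketch built on Python's `concurrent.futures`. The `NonBlockingWriter` class below is a hypothetical stand-in written for illustration only; it is not the Earth2Studio implementation, and the real backend may handle data hand-off differently. Here the writer snapshots the data before queueing the write, and `close` blocks until every pending write has finished.

```python
from concurrent.futures import Future, ThreadPoolExecutor

import numpy as np


class NonBlockingWriter:
    """Hypothetical stand-in for a non-blocking IO backend (illustration only)."""

    def __init__(self, max_workers: int = 4) -> None:
        self.store: dict[int, np.ndarray] = {}
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: list[Future] = []

    def write(self, step: int, data: np.ndarray) -> None:
        # Snapshot in the calling thread so later in-place updates
        # by the caller cannot change what gets written
        snapshot = data.copy()
        self._pending.append(self._pool.submit(self._write_sync, step, snapshot))

    def _write_sync(self, step: int, data: np.ndarray) -> None:
        # Stand-in for the actual (slow) write to a store
        self.store[step] = data

    def close(self) -> None:
        # Wait for every queued write and surface any exception it raised
        for future in self._pending:
            future.result()
        self._pool.shutdown(wait=True)


writer = NonBlockingWriter()
state = np.zeros(4, dtype=np.float32)
for step in range(5):
    state += 1.0  # stand-in for a model step that mutates the state in place
    writer.write(step, state)
writer.close()  # without this, pending writes may not have completed
```

Skipping the copy in `write` would let the in-place update of `state` race with the background write, which is the mutation hazard described above.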

Note that this backend allows Zarr to be comparable to uncompressed NetCDF even with 3x compression!
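The roughly 3x compression figure can be checked with back-of-the-envelope arithmetic from the values reported in the logs: two float32 arrays (t2m and tcwv) of shape (8, 1, 21, 721, 1440) with chunks (8, 1, 1, 721, 1440), and a compressed async store size of 413.07 MB. The sketch below ignores the small coordinate arrays and assumes the reported MB are either 10^6 or 2^20 bytes, since the unit convention of the size helper is not shown here.

```python
import math

shape = (8, 1, 21, 721, 1440)   # array shape from the log
chunks = (8, 1, 1, 721, 1440)   # chunk shape from the log
n_arrays = 2                    # t2m and tcwv
itemsize = 4                    # bytes per float32 value
compressed_mb = 413.07          # store size reported above

chunk_bytes = math.prod(chunks) * itemsize
raw_bytes = math.prod(shape) * itemsize * n_arrays

# Ratio under both unit conventions (assumption: unit of the reported MB is unknown)
ratio_decimal = (raw_bytes / 1e6) / compressed_mb
ratio_binary = (raw_bytes / 2**20) / compressed_mb

print(f"Uncompressed chunk size: {chunk_bytes / 1e6:.1f} MB")
print(f"Uncompressed data size: {raw_bytes / 1e6:.1f} MB")
print(f"Compression ratio: {ratio_binary:.2f}x to {ratio_decimal:.2f}x")
```

Under either convention the ratio lands a little above 3x, consistent with the note above.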

io = AsyncZarrBackend(
    "outputs/17_io_nonblocking_async.zarr",
    parallel_coords=parallel_coords,
    blocking=False,
    zarr_codecs=zarr.codecs.BloscCodec(
        cname="zstd", clevel=3, shuffle=zarr.codecs.BloscShuffle.shuffle
    ),
)
start_time = time.time()
christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
# IMPORTANT: Make sure to call close to ensure IO backend threads have finished!
io.close()
zarr_nonblocking_async_clock = time.time() - start_time
2025-07-09 03:52:45.029 | DEBUG    | earth2studio.io.async_zarr:__init__:154 - Setting up Zarr object pool of size 8, may take a bit

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.054 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 209450878-718645

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.078 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 410101849-875378

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.101 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 295463736-857297

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.124 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 399402626-989950

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.146 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 422682974-1171397

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.169 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 330483332-833835

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.191 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 251866913-804464

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
Fetching GFS data:  14%|█▍        | 1/7 [00:00<00:00,  6.28it/s]
Fetching GFS data: 100%|██████████| 7/7 [00:00<00:00, 43.89it/s]

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.221 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 423575610-1170377

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.244 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 331696727-829753

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.267 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 410972364-870639

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.290 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 400334303-985030

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.312 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 296814817-852960

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.335 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 253345828-804505

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:52:45.357 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 210327553-720496

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
Fetching GFS data:  14%|█▍        | 1/7 [00:00<00:00,  6.32it/s]
Fetching GFS data: 100%|██████████| 7/7 [00:00<00:00, 44.17it/s]


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.410 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.411 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.411 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array ensemble to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.414 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array time to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.417 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array lead_time to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.420 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array lat to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.423 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:298 - Writing coordinate array lon to zarr store


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.426 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:332 - Initializing array t2m with shape (8, 1, 21, 721, 1440) with chunks (8, 1, 1, 721, 1440) dtype <class 'numpy.float32'>


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.427 | DEBUG    | earth2studio.io.async_zarr:_initialize_arrays:332 - Initializing array tcwv with shape (8, 1, 21, 721, 1440) with chunks (8, 1, 1, 721, 1440) dtype <class 'numpy.float32'>


Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]


2025-07-09 03:52:45.475 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:   5%|▍         | 1/21 [00:00<00:01, 14.34it/s]


2025-07-09 03:52:45.509 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   5%|▍         | 1/21 [00:00<00:02,  9.72it/s]


2025-07-09 03:52:45.509 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   5%|▍         | 1/21 [00:00<00:02,  9.68it/s]

Running batch inference:  10%|▉         | 2/21 [00:00<00:01, 17.56it/s]


2025-07-09 03:52:45.562 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  10%|▉         | 2/21 [00:00<00:01, 17.56it/s]


2025-07-09 03:52:45.584 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  10%|▉         | 2/21 [00:00<00:01, 17.56it/s]


2025-07-09 03:52:45.585 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  10%|▉         | 2/21 [00:00<00:01, 17.56it/s]


2025-07-09 03:52:45.664 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  14%|█▍        | 3/21 [00:00<00:01, 17.56it/s]


2025-07-09 03:52:45.697 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  14%|█▍        | 3/21 [00:00<00:01, 17.56it/s]


2025-07-09 03:52:45.698 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  14%|█▍        | 3/21 [00:00<00:01, 17.56it/s]

Running batch inference:  19%|█▉        | 4/21 [00:00<00:01, 12.66it/s]


2025-07-09 03:52:45.749 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  19%|█▉        | 4/21 [00:00<00:01, 12.66it/s]


2025-07-09 03:52:45.757 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  19%|█▉        | 4/21 [00:00<00:01, 12.66it/s]


2025-07-09 03:52:45.757 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  19%|█▉        | 4/21 [00:00<00:01, 12.66it/s]


2025-07-09 03:52:45.815 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  24%|██▍       | 5/21 [00:00<00:01, 12.66it/s]


2025-07-09 03:52:45.860 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  24%|██▍       | 5/21 [00:00<00:01, 12.66it/s]


2025-07-09 03:52:45.861 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  24%|██▍       | 5/21 [00:00<00:01, 12.66it/s]

Running batch inference:  29%|██▊       | 6/21 [00:00<00:01, 12.57it/s]


2025-07-09 03:52:45.909 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  29%|██▊       | 6/21 [00:00<00:01, 12.57it/s]


2025-07-09 03:52:45.917 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  29%|██▊       | 6/21 [00:00<00:01, 12.57it/s]


2025-07-09 03:52:45.917 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  29%|██▊       | 6/21 [00:00<00:01, 12.57it/s]


2025-07-09 03:52:45.971 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  33%|███▎      | 7/21 [00:00<00:01, 12.57it/s]


2025-07-09 03:52:46.017 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  33%|███▎      | 7/21 [00:00<00:01, 12.57it/s]


2025-07-09 03:52:46.017 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  33%|███▎      | 7/21 [00:00<00:01, 12.57it/s]

Running batch inference:  38%|███▊      | 8/21 [00:00<00:01, 12.31it/s]


2025-07-09 03:52:46.082 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  38%|███▊      | 8/21 [00:00<00:01, 12.31it/s]


2025-07-09 03:52:46.090 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  38%|███▊      | 8/21 [00:00<00:01, 12.31it/s]


2025-07-09 03:52:46.090 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  38%|███▊      | 8/21 [00:00<00:01, 12.31it/s]


2025-07-09 03:52:46.161 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  43%|████▎     | 9/21 [00:00<00:00, 12.31it/s]


2025-07-09 03:52:46.206 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  43%|████▎     | 9/21 [00:00<00:00, 12.31it/s]


2025-07-09 03:52:46.206 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  43%|████▎     | 9/21 [00:00<00:00, 12.31it/s]

Running batch inference:  48%|████▊     | 10/21 [00:00<00:00, 11.63it/s]


2025-07-09 03:52:46.267 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  48%|████▊     | 10/21 [00:00<00:00, 11.63it/s]


2025-07-09 03:52:46.275 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  48%|████▊     | 10/21 [00:00<00:00, 11.63it/s]


2025-07-09 03:52:46.276 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  48%|████▊     | 10/21 [00:00<00:00, 11.63it/s]


2025-07-09 03:52:46.319 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  52%|█████▏    | 11/21 [00:00<00:00, 11.63it/s]


2025-07-09 03:52:46.357 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  52%|█████▏    | 11/21 [00:00<00:00, 11.63it/s]


2025-07-09 03:52:46.358 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  52%|█████▏    | 11/21 [00:00<00:00, 11.63it/s]

Running batch inference:  57%|█████▋    | 12/21 [00:00<00:00, 12.39it/s]


2025-07-09 03:52:46.408 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  57%|█████▋    | 12/21 [00:01<00:00, 12.39it/s]


2025-07-09 03:52:46.416 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  57%|█████▋    | 12/21 [00:01<00:00, 12.39it/s]


2025-07-09 03:52:46.416 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  57%|█████▋    | 12/21 [00:01<00:00, 12.39it/s]


2025-07-09 03:52:46.447 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  62%|██████▏   | 13/21 [00:01<00:00, 12.39it/s]


2025-07-09 03:52:46.493 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  62%|██████▏   | 13/21 [00:01<00:00, 12.39it/s]


2025-07-09 03:52:46.493 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  62%|██████▏   | 13/21 [00:01<00:00, 12.39it/s]

Running batch inference:  67%|██████▋   | 14/21 [00:01<00:00, 13.01it/s]


2025-07-09 03:52:46.539 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  67%|██████▋   | 14/21 [00:01<00:00, 13.01it/s]


2025-07-09 03:52:46.547 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  67%|██████▋   | 14/21 [00:01<00:00, 13.01it/s]


2025-07-09 03:52:46.548 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  67%|██████▋   | 14/21 [00:01<00:00, 13.01it/s]


2025-07-09 03:52:46.585 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  71%|███████▏  | 15/21 [00:01<00:00, 13.01it/s]


2025-07-09 03:52:46.631 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  71%|███████▏  | 15/21 [00:01<00:00, 13.01it/s]


2025-07-09 03:52:46.632 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  71%|███████▏  | 15/21 [00:01<00:00, 13.01it/s]

Running batch inference:  76%|███████▌  | 16/21 [00:01<00:00, 13.45it/s]


2025-07-09 03:52:46.667 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  76%|███████▌  | 16/21 [00:01<00:00, 13.45it/s]


2025-07-09 03:52:46.675 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  76%|███████▌  | 16/21 [00:01<00:00, 13.45it/s]


2025-07-09 03:52:46.676 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  76%|███████▌  | 16/21 [00:01<00:00, 13.45it/s]


2025-07-09 03:52:46.730 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  81%|████████  | 17/21 [00:01<00:00, 13.45it/s]


2025-07-09 03:52:46.773 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  81%|████████  | 17/21 [00:01<00:00, 13.45it/s]


2025-07-09 03:52:46.774 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  81%|████████  | 17/21 [00:01<00:00, 13.45it/s]

Running batch inference:  86%|████████▌ | 18/21 [00:01<00:00, 13.70it/s]


2025-07-09 03:52:46.808 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  86%|████████▌ | 18/21 [00:01<00:00, 13.70it/s]


2025-07-09 03:52:46.816 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  86%|████████▌ | 18/21 [00:01<00:00, 13.70it/s]


2025-07-09 03:52:46.817 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  86%|████████▌ | 18/21 [00:01<00:00, 13.70it/s]


2025-07-09 03:52:46.848 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  90%|█████████ | 19/21 [00:01<00:00, 13.70it/s]


2025-07-09 03:52:46.896 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  90%|█████████ | 19/21 [00:01<00:00, 13.70it/s]


2025-07-09 03:52:46.896 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  90%|█████████ | 19/21 [00:01<00:00, 13.70it/s]

Running batch inference:  95%|█████████▌| 20/21 [00:01<00:00, 14.35it/s]


2025-07-09 03:52:46.949 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  95%|█████████▌| 20/21 [00:01<00:00, 14.35it/s]


2025-07-09 03:52:46.957 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  95%|█████████▌| 20/21 [00:01<00:00, 14.35it/s]


2025-07-09 03:52:46.958 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  95%|█████████▌| 20/21 [00:01<00:00, 14.35it/s]

2025-07-09 03:52:46.974 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
2025-07-09 03:52:46.981 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
2025-07-09 03:52:47.013 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays
2025-07-09 03:52:47.039 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
2025-07-09 03:52:47.101 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
2025-07-09 03:52:47.159 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
2025-07-09 03:52:47.262 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
print(
    f"\nNon-blocking async zarr store inference time: {zarr_nonblocking_async_clock}s"
)
print(
    f"Compressed non-blocking async zarr store size: {get_folder_size('outputs/17_io_nonblocking_async.zarr'):.2f} MB"
)
Non-blocking async zarr store inference time: 2.2544188499450684s
Compressed non-blocking async zarr store size: 413.08 MB

Remote Non-Blocking Async Zarr IO#

This IO backend can be further customized by changing the fsspec file system used by the Zarr store, which is controlled via the fs_factory parameter. Note that this is a factory method because the IO backend needs to create multiple instances of the file system. Some examples that may be of interest are:

  • from fsspec.implementations.local import LocalFileSystem (Default, local store)

  • from fsspec.implementations.memory import MemoryFileSystem (in-memory store)

  • from s3fs import S3FileSystem (Remote S3 store)
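To make the factory idea concrete, here is a minimal sketch of the pattern using only the standard library. DemoFileSystem is a hypothetical stand-in and not an fsspec class; the point is only that fs_factory is a zero-argument callable that can be invoked repeatedly, for example once per worker thread, to build separate file system objects.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


class DemoFileSystem:
    """Hypothetical stand-in for a file system class (not part of fsspec)."""

    def __init__(self, root: str, asynchronous: bool = False):
        self.root = root
        self.asynchronous = asynchronous


# functools.partial binds the constructor arguments without calling it,
# giving a zero-argument callable that builds a new object on every call.
fs_factory = functools.partial(DemoFileSystem, root="outputs", asynchronous=True)

# Each worker calls the factory itself instead of sharing a single object.
with ThreadPoolExecutor(max_workers=4) as pool:
    instances = list(pool.map(lambda _: fs_factory(), range(4)))

distinct = len({id(fs) for fs in instances})
print(f"created {distinct} independent file system objects")
```

Passing an already constructed object instead of the factory would force every worker to share it, which is presumably why a callable is required here.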

For the sake of example, let's look at what writing to a remote store would require. Compression is a must in this instance, since we need to minimize the data transferred over the network. The file system factory is set to S3 with the appropriate credentials wrapped in a partial callable object. Lastly, we can increase the maximum number of thread workers with the pool_size parameter to further boost performance.

import functools

import s3fs

if "S3FS_KEY" in os.environ and "S3FS_SECRET" in os.environ:
    # Remember, needs to be a callable
    fs_factory = functools.partial(
        s3fs.S3FileSystem,
        key=os.environ["S3FS_KEY"],
        secret=os.environ["S3FS_SECRET"],
        client_kwargs={"endpoint_url": os.environ.get("S3FS_ENDPOINT", None)},
        asynchronous=True,
    )
    io = AsyncZarrBackend(
        "earth2studio/ci/example/17_io_async.zarr",
        parallel_coords=parallel_coords,
        fs_factory=fs_factory,
        blocking=False,
        pool_size=16,
        zarr_codecs=zarr.codecs.BloscCodec(
            cname="zstd", clevel=3, shuffle=zarr.codecs.BloscShuffle.shuffle
        ),
    )
    christmas_five_day_ensemble(times, 4, model, ds, io, pt, device=device)
    # IMPORTANT: Make sure to call close to ensure IO backend threads have finished!
    io.close()

    # To clean up the zarr store you can use
    # fs = s3fs.S3FileSystem(
    #     key=os.environ["S3FS_KEY"],
    #     secret=os.environ["S3FS_SECRET"],
    #     client_kwargs={"endpoint_url": os.environ.get("S3FS_ENDPOINT", None)},
    # )
    # fs.rm("earth2studio/ci/example/17_io_async.zarr", recursive=True)
2025-07-09 03:52:47.317 | DEBUG    | earth2studio.io.async_zarr:__init__:154 - Setting up Zarr object pool of size 16, may take a bit

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.361 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 399402626-989950

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.407 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 410101849-875378

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.431 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 295463736-857297

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.453 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 422682974-1171397

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.476 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 330483332-833835

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.498 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 251866913-804464

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.520 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221219/18/atmos/gfs.t18z.pgrb2.0p25.f000 209450878-718645

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
Fetching GFS data:  14%|█▍        | 1/7 [00:00<00:01,  5.54it/s]
Fetching GFS data: 100%|██████████| 7/7 [00:00<00:00, 38.76it/s]

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.550 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 410972364-870639

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.572 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 331696727-829753

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.594 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 296814817-852960

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.616 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 253345828-804505

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.638 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 400334303-985030

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.660 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 423575610-1170377

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]

2025-07-09 03:53:52.682 | DEBUG    | earth2studio.data.gfs:fetch_array:380 - Fetching GFS grib file: noaa-gfs-bdp-pds/gfs.20221220/00/atmos/gfs.t00z.pgrb2.0p25.f000 210327553-720496

Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
Fetching GFS data:  14%|█▍        | 1/7 [00:00<00:00,  6.46it/s]
Fetching GFS data: 100%|██████████| 7/7 [00:00<00:00, 45.18it/s]


Running batch inference:   0%|          | 0/5 [00:00<?, ?it/s]


2025-07-09 03:53:52.733 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   0%|          | 0/5 [00:00<?, ?it/s]


2025-07-09 03:53:52.733 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:   0%|          | 0/5 [00:00<?, ?it/s]

Running batch inference:  20%|██        | 1/5 [00:07<00:29,  7.31s/it]


2025-07-09 03:54:00.111 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  20%|██        | 1/5 [00:07<00:29,  7.31s/it]


2025-07-09 03:54:00.167 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  20%|██        | 1/5 [00:07<00:29,  7.31s/it]


2025-07-09 03:54:00.168 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  20%|██        | 1/5 [00:07<00:29,  7.31s/it]

Running batch inference:  40%|████      | 2/5 [00:14<00:21,  7.31s/it]


2025-07-09 03:54:07.410 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  40%|████      | 2/5 [00:14<00:21,  7.31s/it]


2025-07-09 03:54:07.418 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  40%|████      | 2/5 [00:14<00:21,  7.31s/it]


2025-07-09 03:54:07.419 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  40%|████      | 2/5 [00:14<00:21,  7.31s/it]

Running batch inference:  60%|██████    | 3/5 [00:20<00:13,  6.57s/it]


2025-07-09 03:54:13.100 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  60%|██████    | 3/5 [00:20<00:13,  6.57s/it]


2025-07-09 03:54:13.155 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  60%|██████    | 3/5 [00:20<00:13,  6.57s/it]


2025-07-09 03:54:13.156 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  60%|██████    | 3/5 [00:20<00:13,  6.57s/it]

Running batch inference:  80%|████████  | 4/5 [00:23<00:05,  5.33s/it]


2025-07-09 03:54:16.531 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays


Running batch inference:  80%|████████  | 4/5 [00:23<00:05,  5.33s/it]


2025-07-09 03:54:16.539 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:372 - Datetime64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  80%|████████  | 4/5 [00:23<00:05,  5.33s/it]


2025-07-09 03:54:16.540 | WARNING  | earth2studio.io.async_zarr:_scrub_coordinates:378 - Timedelta64 not supported in zarr 3.0, converting to int64 nanoseconds since epoch


Running batch inference:  80%|████████  | 4/5 [00:23<00:05,  5.33s/it]

Running batch inference: 100%|██████████| 5/5 [00:28<00:00,  5.05s/it]

2025-07-09 03:54:21.020 | DEBUG    | earth2studio.io.async_zarr:_limit_pool_size:482 - In IO thread pool throttle, limiting
2025-07-09 03:54:21.087 | DEBUG    | earth2studio.io.async_zarr:_write:616 - Writing 1 chunks to 2 Zarr arrays
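The need to call io.close() and the "thread pool throttle" messages in the log above can be illustrated with a small standard-library sketch. DemoNonBlockingWriter is hypothetical and is not how AsyncZarrBackend is implemented; it only shows the general pattern of submitting writes to a thread pool, throttling when every worker is busy, and waiting for outstanding writes on close.

```python
import threading
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class DemoNonBlockingWriter:
    """Hypothetical writer illustrating the pattern, not AsyncZarrBackend."""

    def __init__(self, pool_size: int = 4):
        self.pool_size = pool_size
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        self._pending = set()
        self._lock = threading.Lock()
        self.written = []

    def _slow_write(self, chunk: int) -> None:
        time.sleep(0.01)  # stands in for disk or network latency
        with self._lock:
            self.written.append(chunk)

    def write(self, chunk: int) -> None:
        # Throttle: when every worker is busy, wait for one write to finish
        if len(self._pending) >= self.pool_size:
            _, self._pending = wait(self._pending, return_when=FIRST_COMPLETED)
        # Submit and return immediately so the caller can keep computing
        self._pending.add(self._pool.submit(self._slow_write, chunk))

    def close(self) -> None:
        # Block until all outstanding writes are done, then stop the threads
        wait(self._pending)
        self._pending.clear()
        self._pool.shutdown(wait=True)


writer = DemoNonBlockingWriter(pool_size=4)
for step in range(10):
    writer.write(step)  # returns before the chunk is actually written
writer.close()  # without this, some chunks may still be in flight
print(sorted(writer.written))
```

In this sketch, reading the results before close() returns could observe a partially written list, which mirrors why the example calls io.close() before inspecting a store.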

Post-Processing#

Lastly, we can plot the output of each local store to verify that they are indeed the same.

import matplotlib.pyplot as plt
import xarray as xr

# Load the datasets
ds_async = xr.open_zarr("outputs/17_io_async.zarr", consolidated=False)
ds_nonblocking = xr.open_zarr(
    "outputs/17_io_nonblocking_async.zarr", consolidated=False
)
ds_sync = xr.open_zarr("outputs/17_io_sync.zarr")
ds_nc = xr.open_dataset("outputs/17_io_sync.nc")

# Create a 2x2 subplot grid
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
fig.suptitle("Comparison of mean t2m across IO Backends")

# Plot t2m from each dataset
axes[0, 0].imshow(
    ds_async.t2m.isel(time=0, lead_time=8).mean(dim="ensemble"), vmin=250, vmax=320
)
axes[0, 0].set_title("Async Zarr")

axes[0, 1].imshow(
    ds_nonblocking.t2m.isel(time=0, lead_time=8).mean(dim="ensemble"),
    vmin=250,
    vmax=320,
)
axes[0, 1].set_title("Non-blocking Async Zarr")

axes[1, 0].imshow(
    ds_sync.t2m.isel(time=0, lead_time=8).mean(dim="ensemble"), vmin=250, vmax=320
)
axes[1, 0].set_title("Sync Zarr")

axes[1, 1].imshow(
    ds_nc.t2m.isel(time=0, lead_time=8).mean(dim="ensemble"), vmin=250, vmax=320
)
axes[1, 1].set_title("NetCDF")

plt.tight_layout()
plt.savefig("outputs/17_io_performance.jpg", bbox_inches="tight")
Figure: Comparison of mean t2m across IO Backends (panels: Async Zarr, Non-blocking Async Zarr, Sync Zarr, NetCDF)

Total running time of the script: (2 minutes 43.900 seconds)
