.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/17_io_performance.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_17_io_performance.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_17_io_performance.py:

IO Backend Performance
======================

Leverage different IO backends for storing inference results.

This example explores the IO backends inside Earth2Studio and how they can be used
to write data to different formats and locations. IO is a core part of any inference
pipeline and, depending on the desired target, can dramatically impact performance.
This example walks users through the different IO backend APIs in a simple workflow.

In this example you will learn:

- Initializing, creating arrays and writing with the Zarr IO backend
- Initializing, creating arrays and writing with the NetCDF IO backend
- Initializing and writing with the asynchronous, non-blocking Zarr IO backend
- Performance implications and strategies that can be used

.. GENERATED FROM PYTHON SOURCE LINES 37-44

.. code-block:: Python

    # /// script
    # dependencies = [
    #   "earth2studio[dlwp] @ git+https://github.com/NVIDIA/earth2studio.git",
    #   "matplotlib",
    # ]
    # ///

.. GENERATED FROM PYTHON SOURCE LINES 45-50

Set Up
------

To demonstrate the different IO backends, this example uses a simple ensemble
workflow that we create manually. One could use the built-in workflow in
Earth2Studio; however, writing it ourselves allows us to better understand the APIs.

.. GENERATED FROM PYTHON SOURCE LINES 52-58

We need the following components:

- Data Source: Pull data from the GFS data API :py:class:`earth2studio.data.GFS`.
- Prognostic Model: Use the built-in DLWP model :py:class:`earth2studio.models.px.DLWP`.
- Perturbation Method: Use the standard Gaussian method :py:class:`earth2studio.perturbation.Gaussian`.
- IO Backends: Use a few IO backends including :py:class:`earth2studio.io.AsyncZarrBackend`,
  :py:class:`earth2studio.io.NetCDF4Backend` and :py:class:`earth2studio.io.ZarrBackend`.

.. GENERATED FROM PYTHON SOURCE LINES 60-89

.. code-block:: Python

    import os

    os.makedirs("outputs", exist_ok=True)
    from dotenv import load_dotenv

    load_dotenv()  # TODO: make common example prep function

    import torch

    from earth2studio.data import GFS, DataSource, fetch_data
    from earth2studio.io import AsyncZarrBackend, IOBackend, NetCDF4Backend, ZarrBackend
    from earth2studio.models.px import DLWP, PrognosticModel
    from earth2studio.perturbation import Gaussian, Perturbation

    # Get the device
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Load the DLWP model package
    package = DLWP.load_default_package()
    model = DLWP.load_model(package)
    model = model.to(device)

    # Create the GFS data source
    ds = GFS()

    # Create perturbation method
    pt = Gaussian()

.. GENERATED FROM PYTHON SOURCE LINES 90-97

Creating a Simple Ensemble Workflow
-----------------------------------

Start by creating a simple ensemble inference workflow. This is essentially a
simpler version of the built-in ensemble workflow :py:meth:`earth2studio.run.ensemble`.
In this case, the workflow runs ensemble inference to predict a 5 day forecast for
Christmas 2022. Following standard Earth2Studio practices, the function accepts an
initialized prognostic model, data source, IO backend and perturbation method.

.. GENERATED FROM PYTHON SOURCE LINES 99-200

.. code-block:: Python


    import os
    import time
    from datetime import datetime, timedelta

    import numpy as np
    from tqdm import tqdm

    from earth2studio.utils.coords import map_coords, split_coords
    from earth2studio.utils.time import to_time_array

    times = [datetime(2022, 12, 20)]
    nsteps = 20  # Assuming 6-hour time steps


    def christmas_five_day_ensemble(
        times: list[datetime],
        nsteps: int,
        prognostic: PrognosticModel,
        data: DataSource,
        io: IOBackend,
        perturbation: Perturbation,
        nensemble: int = 8,
        device: str = "cuda",
    ) -> None:
        """Ensemble inference example"""
        # ==========================================
        # Fetch Initialization Data
        prognostic_ic = prognostic.input_coords()
        times = to_time_array(times)
        x, coords0 = fetch_data(
            source=data,
            time=times,
            variable=prognostic_ic["variable"],
            lead_time=prognostic_ic["lead_time"],
            device=device,
        )
        # ==========================================

        # ==========================================
        # Set up IO backend by pre-allocating arrays (not needed for AsyncZarrBackend)
        total_coords = prognostic.output_coords(prognostic.input_coords()).copy()
        if "batch" in total_coords:
            del total_coords["batch"]
        total_coords["time"] = times
        total_coords["lead_time"] = np.asarray(
            [
                prognostic.output_coords(prognostic.input_coords())["lead_time"] * i
                for i in range(nsteps + 1)
            ]
        ).flatten()
        total_coords.move_to_end("lead_time", last=False)
        total_coords.move_to_end("time", last=False)
        total_coords = {"ensemble": np.arange(nensemble)} | total_coords
        variables_to_save = total_coords.pop("variable")
        io.add_array(total_coords, variables_to_save)
        # ==========================================

        # ==========================================
        # Run inference
        coords = {"ensemble": np.arange(nensemble)} | coords0.copy()
        x = x.unsqueeze(0).repeat(nensemble, *([1] * x.ndim))
        # Map lat and lon if needed
        x, coords = map_coords(x, coords, prognostic_ic)
        # Perturb ensemble
        x, coords = perturbation(x, coords)
        # Create prognostic iterator
        model = prognostic.create_iterator(x, coords)
        with tqdm(
            total=nsteps + 1,
            desc="Running batch inference",
            position=1,
            leave=False,
        ) as pbar:
            for step, (x, coords) in enumerate(model):
                # Dump result to IO, split_coords separates variables to different arrays
                x, coords = map_coords(x, coords, {"variable": np.array(["t2m", "tcwv"])})
                io.write(*split_coords(x, coords))
                pbar.update(1)
                if step == nsteps:
                    break
        # ==========================================


    def get_folder_size(folder_path: str) -> float:
        """Get folder size in megabytes"""
        if os.path.isfile(folder_path):
            return os.path.getsize(folder_path) / (1024 * 1024)
        total_size = 0
        for dirpath, dirnames, filenames in os.walk(folder_path):
            for filename in filenames:
                file_path = os.path.join(dirpath, filename)
                total_size += os.path.getsize(file_path)
        return total_size / (1024 * 1024)

.. GENERATED FROM PYTHON SOURCE LINES 201-209

Local Storage Zarr IO
---------------------

As a baseline, let's run the Zarr IO backend, saving to local disk. Local storage is
typically preferred since the data can be accessed with standard libraries after the
inference pipeline is finished. Chunking plays an important role in performance, both
with respect to compression and when accessing the data. Here we chunk the output
data based on time and lead_time.

.. GENERATED FROM PYTHON SOURCE LINES 211-222

.. code-block:: Python

    io = ZarrBackend(
        "outputs/17_io_sync.zarr",
        chunks={"time": 1, "lead_time": 1},
        backend_kwargs={"overwrite": True},
    )
    start_time = time.time()
    christmas_five_day_ensemble(times, nsteps, model, ds, io, pt, device=device)
    zarr_local_clock = time.time() - start_time

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Fetching GFS data:   0%|          | 0/7 [00:00<?, ?it/s]
    Running batch inference:   0%|          | 0/21 [00:00<?, ?it/s]
.. container:: sphx-glr-download sphx-glr-download-python

    :download:`Download Python source code: 17_io_performance.py <17_io_performance.py>`

.. container:: sphx-glr-download sphx-glr-download-zip

    :download:`Download zipped: 17_io_performance.zip <17_io_performance.zip>`

.. only:: html

    .. rst-class:: sphx-glr-signature

        `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
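A quick back-of-the-envelope check can make the chunking discussion in the Local
Storage Zarr IO section concrete. The sketch below is plain Python: the 8-member
ensemble matches the example above, while the 721 x 1440 float32 grid is an assumed
illustration, not a value taken from the model. It estimates the uncompressed size of
a single chunk when only time and lead_time are chunked, so each chunk spans the full
ensemble, lat and lon extents.

```python
def chunk_size_mb(
    nensemble: int, nlat: int, nlon: int, itemsize: int = 4
) -> float:
    """Uncompressed megabytes of one chunk with chunks={"time": 1, "lead_time": 1}.

    With only time and lead_time chunked, a chunk holds every ensemble member
    and the full lat/lon grid for one (time, lead_time) pair.
    """
    return nensemble * nlat * nlon * itemsize / (1024 * 1024)


# Hypothetical 0.25-degree grid (721 x 1440) at float32, 8 ensemble members
print(f"{chunk_size_mb(8, 721, 1440):.1f} MB per chunk")  # → 31.7 MB per chunk
```

Chunking additional dimensions (for example ensemble) would shrink each chunk at the
cost of many more objects on disk; which trade-off is right depends on how the data
will be read back.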