Higher-Level Libraries

The MSC adapters for higher-level libraries use shortcuts under the hood.

fsspec

multistorageclient.async_fs aliases the multistorageclient.contrib.async_fs module.

This module provides the multistorageclient.contrib.async_fs.MultiAsyncFileSystem class, which implements fsspec’s AsyncFileSystem class.

Note

The msc:// protocol is registered with fsspec automatically when the package is installed with pip install multi-storage-client.
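One way to check the registration described in the note is to inspect installed entry points with the standard library. This is a minimal sketch that assumes the registration goes through fsspec's `fsspec.specs` entry-point group, which is the mechanism fsspec uses to discover third-party filesystems; the source does not state which mechanism MSC uses. The helper name `fsspec_entry_points` is hypothetical.

```python
from importlib.metadata import entry_points


def fsspec_entry_points() -> dict:
    """Map protocol names to implementation paths from installed packages.

    Assumption: third-party filesystems are advertised in the
    "fsspec.specs" entry-point group.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        specs = eps.select(group="fsspec.specs")
    else:
        # Older Pythons return a plain mapping of group name to entries.
        specs = eps.get("fsspec.specs", [])
    return {ep.name: ep.value for ep in specs}


registered = fsspec_entry_points()
print("msc registered:", "msc" in registered)
```

If the result does not contain `msc` in an environment where the package is installed, the registration may use a different mechanism than the one assumed here.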

import multistorageclient as msc

# Create an MSC-based AsyncFileSystem instance.
fs = msc.async_fs.MultiAsyncFileSystem()

# Create a client for the data-s3-iad profile and open a file.
file = fs.open("msc://data-s3-iad/animal-photos/red-panda.png")

# Reuse the client for the data-s3-iad profile and download a file.
fs.get_file(
    rpath="msc://data-s3-iad/animal-photos/red-panda.png",
    lpath="/tmp/animal-photos/red-panda.png"
)
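The comments in the example above describe a client being created for the data-s3-iad profile on first use and reused afterwards, with the profile name taken from the host part of the msc:// URL. The sketch below illustrates that idea with the standard library only. It is not MSC's implementation: `split_msc_url`, `get_client`, and the `_clients` cache are hypothetical names, and a plain `object()` stands in for a real storage client.

```python
from urllib.parse import urlsplit


def split_msc_url(url: str) -> tuple:
    """Hypothetical helper: return (profile, path) for an msc:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "msc":
        raise ValueError(f"not an msc:// URL: {url!r}")
    # The host component names the profile; the rest is the path inside it.
    return parts.netloc, parts.path.lstrip("/")


# Hypothetical per-profile cache standing in for "create once, then reuse".
_clients = {}


def get_client(profile: str) -> object:
    if profile not in _clients:
        _clients[profile] = object()  # placeholder for a real storage client
    return _clients[profile]


profile, path = split_msc_url("msc://data-s3-iad/animal-photos/red-panda.png")
first = get_client(profile)
second = get_client(profile)
print(profile, path, first is second)
```

Keying the cache on the profile name means that any number of URLs under the same profile share one client in this sketch.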

NumPy

multistorageclient.numpy aliases the multistorageclient.contrib.numpy module.

This module provides load, memmap, and save functions for loading and saving NumPy arrays.

import multistorageclient as msc
import numpy

# Create a client for the data-s3-iad profile and load an array.
array = msc.numpy.load("msc://data-s3-iad/numpy-arrays/ndarray-1.npz")

# Reuse the client for the data-s3-iad profile and load a memory-mapped array.
mmarray = msc.numpy.memmap("msc://data-s3-iad/numpy-arrays/ndarray-1.bin")

# Reuse the client for the data-s3-iad profile and save an array.
msc.numpy.save(
    numpy.array([1, 2, 3, 4, 5], dtype=numpy.int32),
    "msc://data-s3-iad/numpy-arrays/ndarray-2.npz"
)

PyTorch

multistorageclient.torch aliases the multistorageclient.contrib.torch module.

This module provides load and save functions for loading and saving PyTorch data.

import multistorageclient as msc
import torch

# Create a client for the data-s3-iad profile and load a tensor.
tensor = msc.torch.load("msc://data-s3-iad/pytorch-tensors/tensor-1.pt")

# Reuse the client for the data-s3-iad profile and save a tensor.
msc.torch.save(
    torch.tensor([1, 2, 3, 4]),
    "msc://data-s3-iad/pytorch-tensors/tensor-2.pt"
)

Xarray

multistorageclient.xz aliases the multistorageclient.contrib.xarray module.

This module provides open_zarr for reading Xarray datasets from Zarr files/objects.

import multistorageclient as msc

# Create a client for the data-s3-iad profile and load a Zarr array into an Xarray dataset.
xarray_dataset = msc.xz.open_zarr("msc://data-s3-iad/abc.zarr")

Note

Xarray supports fsspec URLs natively, so you can use Xarray's standard interface with msc:// URLs.

import xarray

# Use Xarray's native interface to load a Zarr array into an Xarray dataset.
xarray_dataset = xarray.open_zarr("msc://data-s3-iad/abc.zarr")

Zarr

multistorageclient.zarr aliases the multistorageclient.contrib.zarr module.

This module provides open_consolidated for reading Zarr groups from files/objects.

import multistorageclient as msc

# Create a client for the data-s3-iad profile and open a consolidated Zarr group.
z = msc.zarr.open_consolidated("msc://data-s3-iad/abc.zarr")

Note

Zarr supports fsspec URLs natively, so you can use Zarr's standard interface with msc:// URLs.

import zarr

# Use Zarr's native interface to load a Zarr array.
z = zarr.open("msc://data-s3-iad/abc.zarr")

Path

multistorageclient.path aliases the multistorageclient.contrib.path module.

This module provides the Path class for working with paths in a way similar to pathlib.Path.

import multistorageclient as msc

# Create a Path object for a file in the data-s3-iad profile.
path = msc.Path("msc://data-s3-iad/data/file.txt")

# Get the parent directory.
parent = path.parent  # msc://data-s3-iad/data

# Get the file name.
name = path.name  # file.txt

# Join paths.
new_path = path.parent / "other.txt"  # msc://data-s3-iad/data/other.txt

# Check whether the path exists.
exists = path.exists()

# List the contents of a directory.
for child in msc.Path("msc://data-s3-iad/data").iterdir():
    print(child)

# Find files matching a pattern.
for matched in msc.Path("msc://data-s3-iad/data").glob("*.txt"):
    print(matched)

Note

The Path class implements much of the same interface as pathlib.Path, making it familiar to use while working with remote storage.
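For comparison, here are the same operations written against the standard library's pathlib.Path on a local temporary directory. This is only a reference point for the interface the note mentions; it uses no MSC code, and the file names are made up for the illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a small local directory to operate on.
    data = Path(tmp) / "data"
    data.mkdir()
    (data / "file.txt").write_text("hello")
    (data / "other.bin").write_bytes(b"\x00")

    path = data / "file.txt"

    # Parent directory and file name.
    parent_is_data = path.parent == data
    name = path.name

    # Join paths with the / operator.
    sibling = path.parent / "other.txt"

    # Existence checks.
    exists = path.exists()
    sibling_exists = sibling.exists()

    # List directory contents and match a glob pattern.
    children = sorted(child.name for child in data.iterdir())
    matches = sorted(match.name for match in data.glob("*.txt"))

print(name, exists, sibling_exists, children, matches)
```

Code written in this style should need few changes to work on msc:// paths, to the extent that the Path class covers the methods used.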