Higher-Level Libraries
The MSC adapters for higher-level libraries use shortcuts under the hood.
fsspec
multistorageclient.async_fs aliases the multistorageclient.contrib.async_fs module.
This module provides the multistorageclient.contrib.async_fs.MultiAsyncFileSystem class, which implements fsspec’s AsyncFileSystem class.
Note
The msc:// protocol is registered automatically when you run pip install multi-storage-client.
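Because the protocol is registered, fsspec's generic entry points should be able to resolve msc:// URLs without constructing the filesystem class directly. The following is a minimal sketch, assuming fsspec is installed and an MSC profile named data-s3-iad is configured. The helper name read_bytes is hypothetical and not part of MSC or fsspec.

```python
def read_bytes(url: str) -> bytes:
    """Read a whole object through fsspec's generic open() entry point.

    Hypothetical helper for illustration; not part of MSC or fsspec.
    """
    # Imported inside the function so the helper can be defined even in
    # environments where fsspec is not installed.
    import fsspec

    # fsspec selects the filesystem implementation from the URL protocol,
    # which is msc:// in the examples on this page.
    with fsspec.open(url, mode="rb") as f:
        return f.read()


# Example call (requires a configured data-s3-iad profile):
# data = read_bytes("msc://data-s3-iad/animal-photos/red-panda.png")
```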
import multistorageclient as msc

# Create an MSC-based AsyncFileSystem instance.
fs = msc.async_fs.MultiAsyncFileSystem()

# Create a client for the data-s3-iad profile and open a file.
file = fs.open("msc://data-s3-iad/animal-photos/red-panda.png")

# Reuse the client for the data-s3-iad profile and download a file.
fs.get_file(
    rpath="msc://data-s3-iad/animal-photos/red-panda.png",
    lpath="/tmp/animal-photos/red-panda.png",
)
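The comments in the example treat the first component after msc:// as the profile name and the remainder as the path within that profile. The sketch below illustrates that split with the standard library only. The layout is inferred from the examples on this page, and split_msc_url is a hypothetical helper, not an MSC function.

```python
from urllib.parse import urlsplit


def split_msc_url(url: str) -> tuple[str, str]:
    """Split an msc:// URL into (profile, path).

    Hypothetical helper for illustration; not part of MSC.
    """
    parts = urlsplit(url)
    if parts.scheme != "msc":
        raise ValueError(f"not an msc:// URL: {url!r}")
    # Assumption: the first component is the profile name and the rest
    # is the path inside that profile.
    return parts.netloc, parts.path.lstrip("/")


profile, path = split_msc_url("msc://data-s3-iad/animal-photos/red-panda.png")
print(profile)  # data-s3-iad
print(path)     # animal-photos/red-panda.png
```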
NumPy
multistorageclient.numpy aliases the multistorageclient.contrib.numpy module.
This module provides load, memmap, and save functions for loading and saving NumPy arrays.
import multistorageclient as msc
import numpy

# Create a client for the data-s3-iad profile and load an array.
array = msc.numpy.load("msc://data-s3-iad/numpy-arrays/ndarray-1.npz")

# Reuse the client for the data-s3-iad profile and load a memory-mapped array.
mmarray = msc.numpy.memmap("msc://data-s3-iad/numpy-arrays/ndarray-1.bin")

# Reuse the client for the data-s3-iad profile and save an array.
msc.numpy.save(
    numpy.array([1, 2, 3, 4, 5], dtype=numpy.int32),
    "msc://data-s3-iad/numpy-arrays/ndarray-2.npz",
)
PyTorch
multistorageclient.torch aliases the multistorageclient.contrib.torch module.
This module provides load and save functions for loading and saving PyTorch data.
import multistorageclient as msc
import torch

# Create a client for the data-s3-iad profile and load a tensor.
tensor = msc.torch.load("msc://data-s3-iad/pytorch-tensors/tensor-1.pt")

# Reuse the client for the data-s3-iad profile and save a tensor.
msc.torch.save(
    torch.tensor([1, 2, 3, 4]),
    "msc://data-s3-iad/pytorch-tensors/tensor-2.pt",
)
Xarray
multistorageclient.xz aliases the multistorageclient.contrib.xarray module.
This module provides open_zarr for reading Xarray datasets from Zarr files/objects.
import multistorageclient as msc

# Create a client for the data-s3-iad profile and load a Zarr array into an Xarray dataset.
xarray_dataset = msc.xz.open_zarr("msc://data-s3-iad/abc.zarr")
Note
Xarray supports fsspec URLs natively, so you can also use Xarray's standard interface with msc:// URLs.
import xarray

# Use Xarray's native interface to load a Zarr array into an Xarray dataset.
xarray_dataset = xarray.open_zarr("msc://data-s3-iad/abc.zarr")
Zarr
multistorageclient.zarr aliases the multistorageclient.contrib.zarr module.
This module provides open_consolidated for reading Zarr groups from files/objects.
import multistorageclient as msc

# Create a client for the data-s3-iad profile and open a consolidated Zarr group.
z = msc.zarr.open_consolidated("msc://data-s3-iad/abc.zarr")
Note
Zarr supports fsspec URLs natively, so you can also use Zarr's standard interface with msc:// URLs.
import zarr

# Use Zarr's native interface to load a Zarr array.
z = zarr.open("msc://data-s3-iad/abc.zarr")
Path
multistorageclient.path aliases the multistorageclient.contrib.path module.
This module provides the Path class for working with paths in a way similar to pathlib.Path.
import multistorageclient as msc

# Create a Path object for a file in the data-s3-iad profile.
path = msc.Path("msc://data-s3-iad/data/file.txt")

# Get the parent directory.
parent = path.parent  # msc://data-s3-iad/data

# Get the file name.
name = path.name  # file.txt

# Join paths.
new_path = path.parent / "other.txt"  # msc://data-s3-iad/data/other.txt

# Check whether the path exists.
exists = path.exists()

# List the contents of a directory.
for child in msc.Path("msc://data-s3-iad/data").iterdir():
    print(child)

# Find files matching a pattern.
for matched in msc.Path("msc://data-s3-iad/data").glob("*.txt"):
    print(matched)
Note
The Path class implements much of the same interface as pathlib.Path, making it familiar to use while working with remote storage.
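For comparison, here are the same pure-path operations on a standard-library path. This sketch uses only pathlib with a made-up local path, /data/file.txt, and does not touch MSC; it is meant to show the interface that the Path class mirrors.

```python
from pathlib import PurePosixPath

# A made-up local path used only for comparison.
local_path = PurePosixPath("/data/file.txt")

# The same attribute and operator names as in the MSC example.
local_parent = local_path.parent                  # /data
local_name = local_path.name                      # file.txt
local_sibling = local_path.parent / "other.txt"   # /data/other.txt

print(local_parent, local_name, local_sibling)
```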