API Reference

Core

class multistorageclient.CacheConfig(location: str, size_mb: int, use_etag: bool)[source]

Configuration for the CacheManager.

Parameters:
  • location (str) –

  • size_mb (int) –

  • use_etag (bool) –

location: str

The directory where the cache is stored.

size_bytes() int[source]

Convert cache size from megabytes to bytes.

Returns:

The size of the cache in bytes.

Return type:

int

size_mb: int

The maximum size of the cache in megabytes.

use_etag: bool

Use etag to update the cached files.
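
The megabyte-to-byte conversion performed by size_bytes() can be sketched as follows. This is an illustrative stand-in, not the library class: the name CacheConfigSketch is hypothetical, and the factor of 1024 * 1024 bytes per megabyte is an assumption, since the reference does not state which convention is used.

```python
from dataclasses import dataclass


@dataclass
class CacheConfigSketch:
    """Hypothetical stand-in with the same three fields as CacheConfig."""

    location: str
    size_mb: int
    use_etag: bool

    def size_bytes(self) -> int:
        # Assumption: 1 MB is treated as 1024 * 1024 bytes.
        return self.size_mb * 1024 * 1024


config = CacheConfigSketch(location="/tmp/msc_cache", size_mb=512, use_etag=True)
print(config.size_bytes())
```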

class multistorageclient.StorageClient(config: StorageClientConfig)[source]

A client for interacting with different storage providers.

Initializes the StorageClient with the given configuration.

Parameters:

config (StorageClientConfig) – The configuration object for the storage client.

commit_updates(prefix: str | None = None) None[source]

Commits any pending updates to the metadata provider. No-op if not using a metadata provider.

Parameters:

prefix (str | None) – If provided, scans the prefix to find files to commit.

Return type:

None

delete(path: str) None[source]

Deletes an object from the storage provider at the specified path.

Parameters:

path (str) – The path of the object to delete.

Return type:

None

download_file(**kwargs: Any) Any
Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

Any

glob(pattern: str, include_url_prefix: bool = False) List[str][source]

Matches and retrieves a list of objects in the storage provider that match the specified pattern.

Parameters:
  • pattern (str) – The pattern to match object paths against, supporting wildcards (e.g., *.txt).

  • include_url_prefix (bool) – Whether to include the URL prefix msc://profile in the result.

Returns:

A list of object paths that match the pattern.

Return type:

List[str]
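
The effect of include_url_prefix can be illustrated with a small stand-alone sketch over an in-memory list of keys. The helper glob_keys and its profile argument are hypothetical, and the matching uses the standard-library fnmatch module, whose * wildcard also matches / characters; the library's own matcher may behave differently.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def glob_keys(
    keys: Iterable[str],
    pattern: str,
    profile: str,
    include_url_prefix: bool = False,
) -> List[str]:
    """Return the keys matching pattern, optionally as msc:// URLs."""
    # fnmatchcase is case-sensitive; its "*" also matches "/" (an approximation).
    matches = sorted(key for key in keys if fnmatchcase(key, pattern))
    if include_url_prefix:
        return [f"msc://{profile}/{key}" for key in matches]
    return matches


keys = ["data/a.txt", "data/b.bin", "notes.txt"]
print(glob_keys(keys, "data/*.txt", "my-profile", include_url_prefix=True))
```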

info(path: str) ObjectMetadata[source]

Retrieves metadata or information about an object stored at the specified path.

Parameters:

path (str) – The path to the object for which metadata or information is being retrieved.

Returns:

An ObjectMetadata instance containing metadata or information about the object.

Return type:

ObjectMetadata

is_empty(path: str) bool[source]

Checks whether the specified path is empty. A path is considered empty if there are no objects whose keys start with the given path as a prefix.

Parameters:

path (str) – The path to check. This is typically a prefix representing a directory or folder.

Returns:

True if no objects exist under the specified path prefix, False otherwise.

Return type:

bool
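
The prefix-based definition of emptiness above can be sketched over an in-memory collection of keys. The function prefix_is_empty is a hypothetical illustration and does not call any storage provider.

```python
from typing import Iterable


def prefix_is_empty(keys: Iterable[str], path: str) -> bool:
    """Return True if no key starts with the given path prefix."""
    return not any(key.startswith(path) for key in keys)


stored_keys = ["datasets/train/part-0.tar", "datasets/train/part-1.tar"]
print(prefix_is_empty(stored_keys, "datasets/train/"))
print(prefix_is_empty(stored_keys, "datasets/val/"))
```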

is_file(path: str) bool[source]

Checks whether the specified path points to a file (rather than a directory or folder).

Parameters:

path (str) – The path to check.

Returns:

True if the path points to a file, False otherwise.

Return type:

bool

list(prefix: str = '', start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata][source]

Lists objects in the storage provider under the specified prefix.

Parameters:
  • prefix (str) – The prefix to list objects under.

  • start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.

  • end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.

Returns:

An iterator over objects.

Return type:

Iterator[ObjectMetadata]
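
The exclusive start_after and inclusive end_at bounds can be sketched over plain string keys. This sketch assumes keys are compared lexicographically; list_keys is a hypothetical helper that yields keys rather than ObjectMetadata instances.

```python
from typing import Iterable, Iterator, Optional


def list_keys(
    keys: Iterable[str],
    prefix: str = "",
    start_after: Optional[str] = None,
    end_at: Optional[str] = None,
) -> Iterator[str]:
    """Yield keys under prefix within the (start_after, end_at] range."""
    for key in sorted(keys):  # assumption: lexicographic ordering
        if not key.startswith(prefix):
            continue
        if start_after is not None and key <= start_after:
            continue  # exclusive lower bound
        if end_at is not None and key > end_at:
            continue  # inclusive upper bound
        yield key


object_keys = ["a/1", "a/2", "a/3", "a/4", "b/1"]
print(list(list_keys(object_keys, "a/", start_after="a/1", end_at="a/3")))
```

Neither bound has to name an existing key: the comparison only uses string ordering.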

open(path: str, mode: str = 'rb') PosixFile | ObjectFile[source]

Returns a file-like object from the storage provider at the specified path.

Parameters:
  • path (str) – The path of the object to read.

  • mode (str) – The file mode.

Returns:

A file-like object.

Return type:

PosixFile | ObjectFile

read(**kwargs: Any) Any
Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

Any

upload_file(remote_path: str, local_path: str) None[source]

Uploads a file from the local file system to the storage provider.

Parameters:
  • remote_path (str) – The path where the file should be stored in the storage provider.

  • local_path (str) – The local path of the file to upload.

Return type:

None

write(path: str, body: bytes) None[source]

Writes an object to the storage provider at the specified path.

Parameters:
  • path (str) – The path where the object should be written.

  • body (bytes) – The content to write to the object.

Return type:

None

class multistorageclient.StorageClientConfig(profile: str, storage_provider: StorageProvider, credentials_provider: CredentialsProvider | None = None, metadata_provider: MetadataProvider | None = None, cache_config: CacheConfig | None = None, retry_config: RetryConfig | None = None)[source]

Configuration class for the multistorageclient.StorageClient.

Parameters:
cache_config: CacheConfig | None
cache_manager: CacheManager | None
credentials_provider: CredentialsProvider | None
static from_dict(config_dict: Dict[str, Any], profile: str = 'default') StorageClientConfig[source]
Parameters:
  • config_dict (Dict[str, Any]) –

  • profile (str) –

Return type:

StorageClientConfig

static from_file(profile: str = 'default') StorageClientConfig[source]
Parameters:

profile (str) –

Return type:

StorageClientConfig

static from_json(config_json: str, profile: str = 'default') StorageClientConfig[source]
Parameters:
  • config_json (str) –

  • profile (str) –

Return type:

StorageClientConfig

static from_provider_bundle(config_dict: Dict[str, Any], provider_bundle: ProviderBundle) StorageClientConfig[source]
Parameters:
  • config_dict (Dict[str, Any]) –

  • provider_bundle (ProviderBundle) –

Return type:

StorageClientConfig

static from_yaml(config_yaml: str, profile: str = 'default') StorageClientConfig[source]
Parameters:
  • config_yaml (str) –

  • profile (str) –

Return type:

StorageClientConfig

metadata_provider: MetadataProvider | None
profile: str
retry_config: RetryConfig | None
storage_provider: StorageProvider
multistorageclient.download_file(url: str, local_path: str) None[source]

Downloads the file at the given URL to a local path.

The function utilizes the multistorageclient.StorageClient to download a file (object) at the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.

Parameters:
  • url (str) – The URL of the file to download. (example: msc://profile/prefix/dataset.tar)

  • local_path (str) – The local path where the file should be downloaded.

Raises:

ValueError – If the URL’s protocol does not match the expected protocol msc.

Return type:

None

multistorageclient.glob(pattern: str) List[str][source]

Return a list of files matching a pattern.

This function supports glob-style patterns for matching multiple files within a storage system. The pattern is parsed, and the associated multistorageclient.StorageClient is used to retrieve the list of matching files.

Parameters:

pattern (str) – The glob-style pattern to match files. (example: msc://profile/prefix/**/*.tar)

Returns:

A list of file paths matching the pattern.

Raises:

ValueError – If the URL’s protocol does not match the expected protocol msc.

Return type:

List[str]

multistorageclient.is_empty(url: str) bool[source]

Checks whether the specified URL contains any objects.

Parameters:

url (str) – The URL to check, typically pointing to a storage location.

Returns:

True if there are no objects/files under this URL, False otherwise.

Raises:

ValueError – If the URL’s protocol does not match the expected protocol msc.

Return type:

bool

multistorageclient.is_file(url: str) bool[source]

Checks whether the specified URL points to a file (rather than a directory or folder).

The function utilizes the multistorageclient.StorageClient to check if a file (object) exists at the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.

Parameters:

url (str) – The URL to check the existence of a file. (example: msc://profile/prefix/dataset.tar)

Return type:

bool

multistorageclient.open(url: str, mode: str = 'rb') PosixFile | ObjectFile[source]

Open a file at the given URL using the specified mode.

The function utilizes the multistorageclient.StorageClient to open a file at the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.

Parameters:
  • url (str) – The URL of the file to open. (example: msc://profile/prefix/dataset.tar)

  • mode (str) – The file mode to open the file in.

Returns:

A file-like object that allows interaction with the file.

Raises:

ValueError – If the URL’s protocol does not match the expected protocol msc.

Return type:

PosixFile | ObjectFile

multistorageclient.resolve_storage_client(url: str) Tuple[StorageClient, str][source]

Build and return a multistorageclient.StorageClient instance based on the provided URL or path.

This function parses the given URL or path and determines the appropriate storage profile and path. It supports URLs with the protocol msc://, as well as POSIX paths or file:// URLs for local file system access. If the profile has already been instantiated, it returns the cached client. Otherwise, it creates a new StorageClient and caches it.

Parameters:

url (str) – The storage location, which can be:
  • A URL in the format msc://profile/path for object storage.
  • A local file system path (absolute POSIX path) or a file:// URL.

Returns:

A tuple containing the multistorageclient.StorageClient instance and the parsed path.

Raises:

ValueError – If the URL’s protocol is neither msc nor a valid local file system path.

Return type:

Tuple[StorageClient, str]
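
The URL handling described above can be sketched with the standard library. This sketch only splits a location into a profile name and a path and does not build or cache a StorageClient. The function parse_storage_url, the LOCAL_PROFILE name, and the choice to strip the leading slash from msc:// paths are all assumptions made for illustration.

```python
from typing import Tuple
from urllib.parse import urlparse

LOCAL_PROFILE = "local"  # hypothetical profile name for local file system access


def parse_storage_url(url: str) -> Tuple[str, str]:
    """Split a storage location into (profile, path)."""
    parsed = urlparse(url)
    if parsed.scheme == "msc":
        # msc://profile/path: the network location carries the profile name.
        return parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme == "file":
        return LOCAL_PROFILE, parsed.path
    if parsed.scheme == "" and url.startswith("/"):
        # Absolute POSIX path without a scheme.
        return LOCAL_PROFILE, url
    raise ValueError(f"Unsupported URL or path: {url!r}")


print(parse_storage_url("msc://profile/prefix/dataset.tar"))
```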

multistorageclient.upload_file(url: str, local_path: str) None[source]

Upload a file to the given URL from a local path.

The function utilizes the multistorageclient.StorageClient to upload a file (object) to the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.

Parameters:
  • url (str) – The URL of the file. (example: msc://profile/prefix/dataset.tar)

  • local_path (str) – The local path of the file.

Raises:

ValueError – If the URL’s protocol does not match the expected protocol msc.

Return type:

None

Types

class multistorageclient.types.Credentials(access_key: str, secret_key: str, token: str | None, expiration: str | None)[source]

A data class representing the credentials needed to access a storage provider.

Parameters:
  • access_key (str) –

  • secret_key (str) –

  • token (str | None) –

  • expiration (str | None) –

access_key: str

The access key for authentication.

expiration: str | None

The expiration time of the credentials in ISO 8601 format.

is_expired() bool[source]

Checks if the credentials are expired based on the expiration time.

Returns:

True if the credentials are expired, False otherwise.

Return type:

bool
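
An expiration check against an ISO 8601 timestamp can be sketched as follows. The function expiration_has_passed is hypothetical, and two behaviors are assumptions rather than documented facts: a missing expiration is treated as never expiring, and a timestamp without a timezone is treated as UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def expiration_has_passed(expiration: Optional[str], now: Optional[datetime] = None) -> bool:
    """Return True if the ISO 8601 expiration time is at or before now."""
    if expiration is None:
        return False  # assumption: no expiration means the credentials never expire
    # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+.
    expires_at = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now if now is not None else datetime.now(timezone.utc)
    return current >= expires_at


reference_time = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expiration_has_passed("2023-12-31T23:59:59Z", reference_time))
```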

secret_key: str

The secret key for authentication.

token: str | None

An optional security token for temporary credentials.

class multistorageclient.types.CredentialsProvider[source]

Abstract base class for providing credentials to access a storage provider.

abstract get_credentials() Credentials[source]

Retrieves the current credentials.

Returns:

The current credentials used for authentication.

Return type:

Credentials

abstract refresh_credentials() None[source]

Refreshes the credentials if they are expired or about to expire.

Return type:

None

class multistorageclient.types.MetadataProvider[source]

Abstract base class for accessing file metadata.

abstract add_file(path: str, metadata: ObjectMetadata) None[source]

Adds a file to be tracked by the MetadataProvider. The file does not have to be reflected in listings until MetadataProvider.commit_updates() forces a persist.

Parameters:
  • path (str) – User-supplied path

  • metadata (ObjectMetadata) – file metadata

Return type:

None

abstract commit_updates() None[source]

Commits any newly added files. Used in conjunction with MetadataProvider.add_file(). The MetadataProvider will persistently record any metadata changes.

Return type:

None

abstract get_object_metadata(path: str) ObjectMetadata[source]

Retrieves metadata or information about an object stored in the provider.

Parameters:

path (str) – The path of the object.

Returns:

A metadata object containing the information about the object.

Return type:

ObjectMetadata

abstract glob(pattern: str) List[str][source]

Matches and retrieves a list of object keys in the storage provider that match the specified pattern.

Parameters:

pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).

Returns:

A list of object keys that match the specified pattern.

Return type:

List[str]

abstract is_writable() bool[source]

Returns True if the MetadataProvider supports writes, False otherwise.

Return type:

bool

abstract list_objects(prefix: str, start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata][source]

Lists objects in the storage provider under the specified prefix.

Parameters:
  • prefix (str) – The prefix or path to list objects under.

  • start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.

  • end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.

Returns:

An iterator over object metadata under the specified prefix.

Return type:

Iterator[ObjectMetadata]

abstract realpath(path: str) Tuple[str, bool][source]

Returns the canonical, full real physical path for use by a StorageProvider. This provides translation from user-visible paths to the canonical paths needed by a StorageProvider.

Parameters:

path (str) – user-supplied virtual path

Returns:

The canonical physical path and whether the object at the path is valid.

Return type:

Tuple[str, bool]

abstract remove_file(path: str) None[source]

Removes a file tracked by the MetadataProvider. The removal does not have to be reflected in listings until MetadataProvider.commit_updates() forces a persist.

Parameters:

path (str) – User-supplied virtual path

Return type:

None
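
The interaction between add_file(), remove_file(), and commit_updates() can be sketched with a toy in-memory tracker. InMemoryManifest is hypothetical, is not a MetadataProvider subclass, and stores only a content length per path. It defers every change until commit, which the contract above permits but does not require.

```python
from typing import Dict, List, Set


class InMemoryManifest:
    """Toy tracker that applies pending changes only on commit_updates()."""

    def __init__(self) -> None:
        self._committed: Dict[str, int] = {}  # path -> content length
        self._pending_adds: Dict[str, int] = {}
        self._pending_removes: Set[str] = set()

    def add_file(self, path: str, content_length: int) -> None:
        self._pending_removes.discard(path)
        self._pending_adds[path] = content_length

    def remove_file(self, path: str) -> None:
        self._pending_adds.pop(path, None)
        self._pending_removes.add(path)

    def commit_updates(self) -> None:
        # Apply additions first, then removals, then clear the pending state.
        self._committed.update(self._pending_adds)
        for path in self._pending_removes:
            self._committed.pop(path, None)
        self._pending_adds.clear()
        self._pending_removes.clear()

    def list_paths(self) -> List[str]:
        return sorted(self._committed)


manifest = InMemoryManifest()
manifest.add_file("a.bin", 3)
print(manifest.list_paths())  # the pending file is not listed yet
manifest.commit_updates()
print(manifest.list_paths())
```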

class multistorageclient.types.ObjectMetadata(key: str, content_length: int, last_modified: datetime, type: str = 'file', content_type: str | None = None, etag: str | None = None)[source]

A data class that represents the metadata associated with an object stored in a cloud storage service. This metadata includes both required and optional information about the object.

Parameters:
  • key (str) –

  • content_length (int) –

  • last_modified (datetime) –

  • type (str) –

  • content_type (str | None) –

  • etag (str | None) –

content_length: int

The size of the object in bytes.

content_type: str | None = None

The MIME type of the object.

etag: str | None = None

The entity tag (ETag) of the object.

static from_dict(data: dict) ObjectMetadata[source]

Creates an ObjectMetadata instance from a dictionary (parsed from JSON).

Parameters:

data (dict) –

Return type:

ObjectMetadata

key: str

Relative path of the object.

last_modified: datetime

The timestamp indicating when the object was last modified.

to_dict() dict[source]
Return type:

dict

type: str = 'file'
class multistorageclient.types.ProviderBundle[source]

Abstract base class that serves as a container for various providers (storage, credentials, and metadata) that interact with a storage service. The ProviderBundle abstracts access to these providers, allowing for flexible implementations of cloud storage solutions.

abstract property credentials_provider: CredentialsProvider | None
Returns:

The credentials provider responsible for managing authentication credentials required to access the storage service.

abstract property metadata_provider: MetadataProvider | None
Returns:

The metadata provider responsible for retrieving metadata about objects in the storage service.

abstract property storage_provider_config: StorageProviderConfig
Returns:

The configuration for the storage provider, which includes the provider name/type and additional options.

class multistorageclient.types.Range(offset: int, size: int)[source]

Byte-range read.

Parameters:
  • offset (int) –

  • size (int) –

offset: int
size: int
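
A byte-range read over an in-memory buffer can be sketched as follows. ByteRange and read_range are hypothetical names, and the sketch assumes offset is a zero-based start position and size is the number of bytes to read.

```python
from dataclasses import dataclass


@dataclass
class ByteRange:
    """Hypothetical stand-in with the same two fields as Range."""

    offset: int
    size: int


def read_range(data: bytes, byte_range: ByteRange) -> bytes:
    """Return size bytes of data starting at offset (fewer at the end of data)."""
    start = byte_range.offset
    return data[start : start + byte_range.size]


payload = b"0123456789"
print(read_range(payload, ByteRange(offset=2, size=4)))
```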
class multistorageclient.types.RetryConfig(attempts: int = 3, delay: float = 1.0)[source]

A data class that represents the configuration for retry strategy.

Parameters:
  • attempts (int) –

  • delay (float) –

attempts: int = 3

The number of attempts before giving up. Must be at least 1.

delay: float = 1.0

The delay (in seconds) between retry attempts. Must be a non-negative value.

exception multistorageclient.types.RetryableError[source]

Exception raised for errors that should trigger a retry.
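
How attempts and delay might drive a retry loop can be sketched as follows. This is a minimal illustration, not the library's retry implementation: call_with_retry and RetryableErrorSketch are hypothetical, and a fixed delay between attempts is assumed.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableErrorSketch(Exception):
    """Hypothetical stand-in for an error that should trigger a retry."""


def call_with_retry(func: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call func up to attempts times, sleeping delay seconds between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    if delay < 0:
        raise ValueError("delay must be non-negative")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except RetryableErrorSketch:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
    raise AssertionError("unreachable")
```

Errors of any other type propagate immediately without a retry.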

class multistorageclient.types.StorageProvider[source]

Abstract base class for interacting with a storage provider.

abstract delete_object(path: str) None[source]

Deletes an object from the storage provider.

Parameters:

path (str) – The path of the object to delete.

Return type:

None

abstract download_file(remote_path: str, f: str | IO, metadata: ObjectMetadata | None = None) None[source]

Downloads a file from the storage provider to the local file system.

Parameters:
  • remote_path (str) – The path of the file to download.

  • f (str | IO) – The destination for the downloaded file. This can either be a string representing the local file path where the file will be saved, or a file-like object to write the downloaded content into.

  • metadata (ObjectMetadata | None) – Metadata about the object to download.

Return type:

None

abstract get_object(path: str, byte_range: Range | None = None) bytes[source]

Retrieves an object from the storage provider.

Parameters:
  • path (str) – The path where the object is stored.

  • byte_range (Range | None) –

Returns:

The content of the retrieved object.

Return type:

bytes

abstract get_object_metadata(path: str) ObjectMetadata[source]

Retrieves metadata or information about an object stored in the provider.

Parameters:

path (str) – The path of the object.

Returns:

A metadata object containing the information about the object.

Return type:

ObjectMetadata

abstract glob(pattern: str) List[str][source]

Matches and retrieves a list of object keys in the storage provider that match the specified pattern.

Parameters:

pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).

Returns:

A list of object keys that match the specified pattern.

Return type:

List[str]

abstract is_file(path: str) bool[source]

Checks whether the specified key in the storage provider points to a file (as opposed to a folder or directory).

Parameters:

path (str) – The path to check.

Returns:

True if the key points to a file, False if it points to a directory or folder.

Return type:

bool

abstract list_objects(prefix: str, start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata][source]

Lists objects in the storage provider under the specified prefix.

Parameters:
  • prefix (str) – The prefix or path to list objects under.

  • start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.

  • end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.

Returns:

An iterator over object metadata under the specified prefix.

Return type:

Iterator[ObjectMetadata]

abstract put_object(path: str, body: bytes) None[source]

Uploads an object to the storage provider.

Parameters:
  • path (str) – The path where the object will be stored.

  • body (bytes) – The content of the object to store.

Return type:

None

abstract upload_file(remote_path: str, f: str | IO) None[source]

Uploads a file from the local file system to the storage provider.

Parameters:
  • remote_path (str) – The path where the object will be stored.

  • f (str | IO) – The source file to upload. This can either be a string representing the local file path, or a file-like object (e.g., an open file handle).

Return type:

None
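
A provider implementation has to accept either a local path or a file-like object for f. One way to normalize the two cases is sketched below. The helper read_upload_source is hypothetical and reads the whole source into memory, which a real provider may avoid for large files.

```python
import io
from typing import IO, Union


def read_upload_source(f: Union[str, IO[bytes]]) -> bytes:
    """Return the bytes to upload from a local path or a binary file-like object."""
    if isinstance(f, str):
        # A string is treated as a local file path.
        with open(f, "rb") as handle:
            return handle.read()
    # Anything else is assumed to be a binary file-like object.
    return f.read()


print(read_upload_source(io.BytesIO(b"payload")))
```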

class multistorageclient.types.StorageProviderConfig(type: str, options: Dict[str, Any] | None = None)[source]

A data class that represents the configuration needed to initialize a storage provider.

Parameters:
  • type (str) –

  • options (Dict[str, Any] | None) –

options: Dict[str, Any] | None = None

Additional options required to configure the storage provider (e.g., endpoint URLs, region, etc.).

type: str

The name or type of the storage provider (e.g., s3, gcs, oci, azure).

Providers

class multistorageclient.providers.AIStoreStorageProvider(endpoint: str, provider: str = 'ais', skip_verify: bool = True, ca_cert: str | None = None, timeout: float | Tuple[float, float] | None = None, base_path: str = '', credentials_provider: CredentialsProvider | None = None, **kwargs: Any)[source]

AIStore client for managing buckets, objects, and ETL jobs.

Parameters:
  • endpoint (str) – The AIStore endpoint.

  • skip_verify (bool) – Whether to skip SSL certificate verification.

  • ca_cert (str | None) – Path to a CA certificate file for SSL verification.

  • timeout (float | Tuple[float, float] | None) – Request timeout in seconds; a single float for both connect/read timeouts (e.g., 5.0), a tuple for separate connect/read timeouts (e.g., (3.0, 10.0)), or None to disable timeout.

  • token – Authorization token. If not provided, the AIS_AUTHN_TOKEN environment variable will be used.

  • base_path (str) – The root prefix path within the bucket where all operations will be scoped.

  • provider (str) –

  • credentials_provider (CredentialsProvider | None) –

  • kwargs (Any) –

class multistorageclient.providers.AzureBlobStorageProvider(endpoint_url: str, base_path: str = '', credentials_provider: CredentialsProvider | None = None)[source]

A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Azure Blob Storage.

Initializes the AzureBlobStorageProvider with the endpoint URL and optional credentials provider.

Parameters:
  • endpoint_url (str) – The Azure storage account URL.

  • base_path (str) – The root prefix path within the container where all operations will be scoped.

  • credentials_provider (CredentialsProvider | None) – The provider to retrieve Azure credentials.

class multistorageclient.providers.GoogleStorageProvider(project_id: str, base_path: str = '', credentials_provider: CredentialsProvider | None = None)[source]

A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Google Cloud Storage.

Initializes the GoogleStorageProvider with the project ID and optional credentials provider.

Parameters:
  • project_id (str) – The Google Cloud project ID.

  • base_path (str) – The root prefix path within the bucket where all operations will be scoped.

  • credentials_provider (CredentialsProvider | None) – The provider to retrieve GCS credentials.

class multistorageclient.providers.ManifestMetadataProvider(storage_provider: StorageProvider, manifest_path: str, writable: bool = False)[source]

Creates a ManifestMetadataProvider.

Parameters:
  • storage_provider (StorageProvider) – Storage provider.

  • manifest_path (str) – Main manifest file path.

  • writable (bool) – If true, allows modifications and new manifests to be written.

add_file(path: str, metadata: ObjectMetadata) None[source]

Adds a file to be tracked by the MetadataProvider. The file does not have to be reflected in listings until MetadataProvider.commit_updates() forces a persist.

Parameters:
  • path (str) – User-supplied path

  • metadata (ObjectMetadata) – file metadata

Return type:

None

commit_updates() None[source]

Commits any newly added files. Used in conjunction with MetadataProvider.add_file(). The MetadataProvider will persistently record any metadata changes.

Return type:

None

get_object_metadata(path: str) ObjectMetadata[source]

Retrieves metadata or information about an object stored in the provider.

Parameters:

path (str) – The path of the object.

Returns:

A metadata object containing the information about the object.

Return type:

ObjectMetadata

glob(pattern: str) List[str][source]

Matches and retrieves a list of object keys in the storage provider that match the specified pattern.

Parameters:

pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).

Returns:

A list of object keys that match the specified pattern.

Return type:

List[str]

is_writable() bool[source]

Returns True if the MetadataProvider supports writes, False otherwise.

Return type:

bool

list_objects(prefix: str, start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata][source]

Lists objects in the storage provider under the specified prefix.

Parameters:
  • prefix (str) – The prefix or path to list objects under.

  • start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.

  • end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.

Returns:

An iterator over object metadata under the specified prefix.

Return type:

Iterator[ObjectMetadata]

realpath(path: str) Tuple[str, bool][source]

Returns the canonical, full real physical path for use by a StorageProvider. This provides translation from user-visible paths to the canonical paths needed by a StorageProvider.

Parameters:

path (str) – user-supplied virtual path

Returns:

The canonical physical path and whether the object at the path is valid.

Return type:

Tuple[str, bool]

remove_file(path: str) None[source]

Removes a file tracked by the MetadataProvider. The removal does not have to be reflected in listings until MetadataProvider.commit_updates() forces a persist.

Parameters:

path (str) – User-supplied virtual path

Return type:

None

class multistorageclient.providers.OracleStorageProvider(namespace: str, base_path: str = '', credentials_provider: CredentialsProvider | None = None, **kwargs: Any)[source]

A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Oracle Cloud Infrastructure (OCI) Object Storage.

Initializes the OracleStorageProvider with the region, compartment ID, and optional credentials provider.

Parameters:
  • region_name – The OCI region where the Object Storage is located.

  • compartment_id – The OCI compartment ID for the Object Storage.

  • base_path (str) – The root prefix path within the bucket where all operations will be scoped.

  • credentials_provider (CredentialsProvider | None) – The provider to retrieve OCI credentials.

  • namespace (str) –

  • kwargs (Any) –

class multistorageclient.providers.PosixFileStorageProvider(base_path: str, **kwargs: Any)[source]
Parameters:
  • base_path (str) –

  • kwargs (Any) –

glob(pattern: str) List[str][source]

Matches and retrieves a list of object keys in the storage provider that match the specified pattern.

Parameters:

pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).

Returns:

A list of object keys that match the specified pattern.

Return type:

List[str]

is_file(path: str) bool[source]

Checks whether the specified key in the storage provider points to a file (as opposed to a folder or directory).

Parameters:

path (str) – The path to check.

Returns:

True if the key points to a file, False if it points to a directory or folder.

Return type:

bool

class multistorageclient.providers.S3StorageProvider(region_name: str = '', endpoint_url: str = '', base_path: str = '', credentials_provider: CredentialsProvider | None = None, **kwargs: Any)[source]

A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Amazon S3 or SwiftStack.

Initializes the S3StorageProvider with the region, endpoint URL, and optional credentials provider.

Parameters:
  • region_name (str) – The AWS region where the S3 bucket is located.

  • endpoint_url (str) – The custom endpoint URL for the S3 service.

  • base_path (str) – The root prefix path within the S3 bucket where all operations will be scoped.

  • credentials_provider (CredentialsProvider | None) – The provider to retrieve S3 credentials.

  • kwargs (Any) –

class multistorageclient.providers.StaticAISCredentialProvider(username: str | None = None, password: str | None = None, authn_endpoint: str | None = None, token: str | None = None, skip_verify: bool = True, ca_cert: str | None = None)[source]

A concrete implementation of the multistorageclient.types.CredentialsProvider that provides static AIStore credentials.

Initializes the StaticAISCredentialProvider with the given credentials.

Parameters:
  • username (str | None) – The username for the AIStore authentication.

  • password (str | None) – The password for the AIStore authentication.

  • authn_endpoint (str | None) – The AIStore authentication endpoint.

  • token (str | None) – The AIStore authentication token. This is used for authentication if username, password and authn_endpoint are not provided.

  • skip_verify (bool) – If true, skip SSL certificate verification.

  • ca_cert (str | None) – Path to a CA certificate file for SSL verification.

get_credentials() Credentials[source]

Retrieves the current credentials.

Returns:

The current credentials used for authentication.

Return type:

Credentials

refresh_credentials() None[source]

Refreshes the credentials if they are expired or about to expire.

Return type:

None

class multistorageclient.providers.StaticAzureCredentialsProvider(connection: str)[source]

A concrete implementation of the multistorageclient.types.CredentialsProvider that provides static Azure credentials.

Initializes the StaticAzureCredentialsProvider with the provided connection string.

Parameters:

connection (str) – The connection string for Azure Blob Storage authentication.

get_credentials() Credentials[source]

Retrieves the current credentials.

Returns:

The current credentials used for authentication.

Return type:

Credentials

refresh_credentials() None[source]

Refreshes the credentials if they are expired or about to expire.

Return type:

None

class multistorageclient.providers.StaticS3CredentialsProvider(access_key: str, secret_key: str, session_token: str | None = None)[source]

A concrete implementation of the multistorageclient.types.CredentialsProvider that provides static S3 credentials.

Initializes the StaticS3CredentialsProvider with the provided access key, secret key, and optional session token.

Parameters:
  • access_key (str) – The access key for S3 authentication.

  • secret_key (str) – The secret key for S3 authentication.

  • session_token (str | None) – An optional session token for temporary credentials.

get_credentials() Credentials[source]

Retrieves the current credentials.

Returns:

The current credentials used for authentication.

Return type:

Credentials

refresh_credentials() None[source]

Refreshes the credentials if they are expired or about to expire.

Return type:

None

Higher-Level Libraries

fsspec

class multistorageclient.contrib.async_fs.MultiAsyncFileSystem(*args, **kwargs)[source]

Custom fsspec.asyn.AsyncFileSystem implementation for MSC protocol (msc://). Uses multistorageclient.StorageClient for backend operations.

Initializes the MultiAsyncFileSystem.

Parameters:

kwargs – Additional arguments for the fsspec.asyn.AsyncFileSystem.

static asynchronize_sync(func: Callable[[...], Any], *args: Any, **kwargs: Any) Any[source]

Runs a synchronous function asynchronously using asyncio.

Parameters:
  • func (Callable[[...], Any]) – The synchronous function to be executed asynchronously.

  • args (Any) – Positional arguments to pass to the function.

  • kwargs (Any) – Keyword arguments to pass to the function.

Returns:

The result of the asynchronous execution of the function.

Return type:

Any
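
One common way to run a synchronous function from asyncio code is to hand it to the event loop's default executor. The sketch below shows that pattern; run_sync_in_executor and slow_add are hypothetical, and the library's asynchronize_sync may be implemented differently.

```python
import asyncio
import functools
from typing import Any, Callable


async def run_sync_in_executor(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run a blocking function in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only, so bind kwargs first.
    bound = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(None, bound)


def slow_add(a: int, b: int = 0) -> int:
    """Placeholder for a blocking call such as a storage request."""
    return a + b


print(asyncio.run(run_sync_in_executor(slow_add, 2, b=3)))
```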

cat_file(path: str, **kwargs: Any) bytes[source]

Reads the contents of a file at the given path.

Parameters:
  • path (str) – The file path to read from.

  • kwargs (Any) – Additional arguments for file reading functionality.

Returns:

The contents of the file as bytes.

Return type:

bytes

get_file(rpath: str, lpath: str, **kwargs: Any) None[source]

Downloads a file from the remote path to the local path.

Parameters:
  • rpath (str) – The remote path of the file to download.

  • lpath (str) – The local path to store the file.

  • kwargs (Any) – Additional arguments for file retrieval functionality.

Return type:

None

glob(path: str, maxdepth: int | None = None, **kwargs: Any) List[str][source]

Matches and retrieves a list of objects in the storage provider that match the specified pattern.

Parameters:
  • path (str) – The pattern to match object paths against, supporting wildcards (e.g., *.txt).

  • maxdepth (int | None) – The maximum depth of the pattern match.

  • kwargs (Any) –

Returns:

A list of object paths that match the pattern.

Return type:

List[str]

info(path: str, **kwargs: Any) Dict[str, Any][source]

Retrieves metadata information for a file.

Parameters:
  • path (str) – The file path to retrieve information for.

  • kwargs (Any) – Additional arguments for info functionality.

Returns:

A dictionary containing file metadata such as ETag, last modified, and size.

Return type:

Dict[str, Any]

ls(path: str, detail: bool = True, **kwargs: Any) List[Dict[str, Any]] | List[str][source]

Lists the contents of a directory.

Parameters:
  • path (str) – The directory path to list.

  • detail (bool) – Whether to return detailed information for each file.

  • kwargs (Any) – Additional arguments for list functionality.

Returns:

A list of file names or detailed information depending on the ‘detail’ argument.

Return type:

List[Dict[str, Any]] | List[str]
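The two return shapes selected by detail can be sketched against an in-memory mapping. This is an illustration only: OBJECTS and sketch_ls are hypothetical, the dictionary keys (name, size, type) follow the general fsspec convention and are an assumption about this class, and the sketch lists every key under the prefix rather than modelling nested directories.

```python
from typing import Any, Dict, List, Union

# Hypothetical in-memory "bucket": object key -> size in bytes.
OBJECTS = {"data/a.bin": 3, "data/b.bin": 5, "logs/run.txt": 7}


def sketch_ls(path: str, detail: bool = True) -> Union[List[Dict[str, Any]], List[str]]:
    """Hypothetical listing over OBJECTS that mimics the detail switch."""
    prefix = path.rstrip("/") + "/"
    names = sorted(key for key in OBJECTS if key.startswith(prefix))
    if not detail:
        # detail=False: plain list of names.
        return names
    # detail=True: one metadata dictionary per entry.
    return [{"name": name, "size": OBJECTS[name], "type": "file"} for name in names]


names_only = sketch_ls("data", detail=False)
detailed = sketch_ls("data")
```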

open(path: str, mode: str = 'rb', **kwargs: Any) PosixFile | ObjectFile[source]

Opens a file at the given path.

Parameters:
  • path (str) – The file path to open.

  • mode (str) – The mode in which to open the file.

  • kwargs (Any) – Additional arguments for file opening.

Returns:

A file object (PosixFile or ObjectFile) representing the opened file.

Return type:

PosixFile | ObjectFile

pipe_file(path: str, value: bytes, **kwargs: Any) None[source]

Writes a value (bytes) directly to a file at the given path.

Parameters:
  • path (str) – The file path to write the value to.

  • value (bytes) – The bytes to write to the file.

  • kwargs (Any) – Additional arguments for writing functionality.

Return type:

None

protocol: ClassVar[str | tuple[str, ...]] = 'msc'

put_file(lpath: str, rpath: str, **kwargs: Any) None[source]

Uploads a local file to the remote path.

Parameters:
  • lpath (str) – The local path of the file to upload.

  • rpath (str) – The remote path to store the file.

  • kwargs (Any) – Additional arguments for file upload functionality.

Return type:

None

resolve_path_and_storage_client(path: str) Tuple[StorageClient, str][source]

Resolves the path and retrieves the associated multistorageclient.StorageClient.

Parameters:

path (str) – The file path to resolve.

Returns:

A tuple containing the multistorageclient.StorageClient and the resolved path.

Return type:

Tuple[StorageClient, str]
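The path-splitting half of this resolution can be sketched with the standard library, assuming MSC URLs have the form msc://profile/path (as suggested by the msc://profile URL prefix mentioned for StorageClient.glob). split_msc_url is a hypothetical helper. The real method returns a StorageClient rather than a profile name, and how a profile maps to a client is not shown here.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_msc_url(url: str) -> Tuple[str, str]:
    """Hypothetical helper: split an msc:// URL into (profile, path)."""
    parsed = urlparse(url)
    if parsed.scheme != "msc":
        raise ValueError(f"not an msc:// URL: {url!r}")
    # Assumption: the authority component names the profile and the
    # remainder is the object path within that profile.
    return parsed.netloc, parsed.path.lstrip("/")


profile, key = split_msc_url("msc://my-profile/datasets/train.bin")
```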

rm(path: str, recursive: bool = False, **kwargs: Any) None[source]

Removes a file or directory.

Parameters:
  • path (str) – The file or directory path to remove.

  • recursive (bool) – If True, will remove directories and their contents recursively.

  • kwargs (Any) – Additional arguments for remove functionality.

Raises:

IsADirectoryError – If the path is a directory and recursive is not set to True.

Return type:

None
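The interaction between recursive and IsADirectoryError can be sketched over a plain set of object keys. sketch_rm is a hypothetical function. Treating any key prefix with children as a directory is an assumption of this sketch, not a statement about how the library detects directories.

```python
from typing import Set


def sketch_rm(keys: Set[str], path: str, recursive: bool = False) -> None:
    """Hypothetical removal over a set of keys, mimicking the documented contract."""
    prefix = path.rstrip("/") + "/"
    children = {key for key in keys if key.startswith(prefix)}
    if children and not recursive:
        # The path behaves like a directory, so a non-recursive call is refused.
        raise IsADirectoryError(f"{path} is a directory; pass recursive=True")
    if children:
        keys.difference_update(children)
    else:
        keys.discard(path)


store = {"d/a", "d/b", "f"}
try:
    sketch_rm(store, "d")
    refused = False
except IsADirectoryError:
    refused = True
sketch_rm(store, "d", recursive=True)
```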

NumPy

multistorageclient.contrib.numpy.load(*args: Any, **kwargs: Any) ndarray | Dict[str, ndarray] | NpzFile[source]

Adapt numpy.load.

Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

ndarray | Dict[str, ndarray] | NpzFile

multistorageclient.contrib.numpy.memmap(*args: Any, **kwargs: Any) memmap[source]

Adapt numpy.memmap.

Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

memmap

multistorageclient.contrib.numpy.save(*args: Any, **kwargs: Any) None[source]

Adapt numpy.save.

Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

None

PyTorch

multistorageclient.contrib.torch.load(f: str | PathLike | BinaryIO | IO[bytes], *args: Any, **kwargs: Any) Any[source]

Adapt torch.load.

Parameters:
  • f (str | PathLike | BinaryIO | IO[bytes]) –

  • args (Any) –

  • kwargs (Any) –

Return type:

Any

multistorageclient.contrib.torch.save(obj: object, f: str | PathLike | BinaryIO | IO[bytes], *args: Any, **kwargs: Any) Any[source]

Adapt torch.save.

Parameters:
  • obj (object) –

  • f (str | PathLike | BinaryIO | IO[bytes]) –

  • args (Any) –

  • kwargs (Any) –

Return type:

Any

Xarray

multistorageclient.contrib.xarray.open_zarr(*args: Any, **kwargs: Any) Dataset[source]

Adapt xarray.open_zarr to use multistorageclient.contrib.zarr.LazyZarrStore when path matches the msc protocol.

If the path starts with the MSC protocol, it uses multistorageclient.contrib.zarr.LazyZarrStore with a resolved storage client and prefix, passing msc_max_workers if provided. Otherwise, it directly calls xarray.open_zarr.

Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

Dataset
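The dispatch described above (MSC paths go through an MSC-aware code path, everything else goes to the original function) can be sketched as a small wrapper. All names here (adapt, fake_open, fake_msc_open) are hypothetical, and removing msc_max_workers before calling the original function is an assumption of this sketch.

```python
from typing import Any, Callable

MSC_PREFIX = "msc://"


def adapt(original: Callable[..., Any], msc_handler: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical wrapper: route msc:// paths to msc_handler, others to original."""

    def wrapper(path: Any, *args: Any, **kwargs: Any) -> Any:
        # msc_max_workers only matters on the MSC branch, so it is
        # removed before the fallback call (an assumption of this sketch).
        max_workers = kwargs.pop("msc_max_workers", None)
        if isinstance(path, str) and path.startswith(MSC_PREFIX):
            return msc_handler(path, *args, msc_max_workers=max_workers, **kwargs)
        return original(path, *args, **kwargs)

    return wrapper


def fake_open(path: str, **kwargs: Any) -> tuple:
    # Stands in for the wrapped library function.
    return ("original", path)


def fake_msc_open(path: str, msc_max_workers: Any = None, **kwargs: Any) -> tuple:
    # Stands in for the store-backed code path.
    return ("msc", path, msc_max_workers)


open_any = adapt(fake_open, fake_msc_open)
msc_result = open_any("msc://my-profile/ds.zarr", msc_max_workers=4)
local_result = open_any("/tmp/ds.zarr")
```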

Zarr

class multistorageclient.contrib.zarr.LazyZarrStore(storage_client: StorageClient, prefix: str = '', msc_max_workers: int | None = None)[source]

Parameters:
  • storage_client (StorageClient) –

  • prefix (str) –

  • msc_max_workers (int | None) –

getitems(keys: Sequence[str], *, contexts: Any) Mapping[str, Any][source]

Retrieve data from multiple keys.

Parameters:
  • keys (Sequence[str]) – The keys to retrieve.

  • contexts (Any) – A Mapping[str, Context] of keys to their context. Each context is a mapping of store-specific information. For example, a context could be a dict telling the store the preferred output array type: {"meta_array": cupy.empty(())}

Returns:

A collection mapping the input keys to their results.

Notes:

This default implementation uses __getitem__() to read each key sequentially and ignores contexts. Override this method to implement concurrent reads of multiple keys and/or to utilize the contexts.

Return type:

Mapping[str, Any]
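The note on concurrent reads can be illustrated with a thread pool. This is a sketch under stated assumptions, not the LazyZarrStore implementation: SketchStore is hypothetical, it reads from an in-memory dict, it ignores contexts, and it raises KeyError for a missing key.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Mapping, Optional, Sequence


class SketchStore:
    """Hypothetical store that reads several keys concurrently."""

    def __init__(self, data: Dict[str, bytes], max_workers: Optional[int] = None):
        self._data = data
        self._max_workers = max_workers

    def __getitem__(self, key: str) -> bytes:
        # Stands in for a single (possibly slow) remote read.
        return self._data[key]

    def getitems(self, keys: Sequence[str]) -> Mapping[str, bytes]:
        # One read per key is submitted to the pool; pool.map preserves
        # input order, so results can be zipped back onto the keys.
        with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
            values = list(pool.map(self.__getitem__, keys))
        return dict(zip(keys, values))


store = SketchStore({"a": b"1", "b": b"2", "c": b"3"}, max_workers=2)
fetched = store.getitems(["a", "c"])
```

A thread pool suits this case because each read is I/O-bound, so worker threads spend most of their time waiting on the storage backend.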

keys() Iterator[str][source]

Provides a view on the store's keys.

Return type:

Iterator[str]

multistorageclient.contrib.zarr.open_consolidated(*args: Any, **kwargs: Any) Group[source]

Adapt zarr.open_consolidated to use LazyZarrStore when path matches the msc protocol.

If the path starts with the MSC protocol, it uses LazyZarrStore with a resolved storage client and prefix, passing msc_max_workers if provided. Otherwise, it directly calls zarr.open_consolidated.

Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

Group