API Reference
Core
- class multistorageclient.CacheConfig(location: str, size_mb: int, use_etag: bool)[source]¶
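Configuration for the CacheManager.
- Parameters:
location (str) – The directory where the cache is stored.
size_mb (int) – The maximum size of the cache in megabytes.
use_etag (bool) – Whether to use the ETag to update the cached files.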
- location: str¶
The directory where the cache is stored.
- size_bytes() int [source]¶
Convert cache size from megabytes to bytes.
- Returns:
The size of the cache in bytes.
- Return type:
int
- size_mb: int¶
The maximum size of the cache in megabytes.
- use_etag: bool¶
Use etag to update the cached files.
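The megabyte-to-byte conversion described for size_bytes() can be pictured with a small stand-in class. This is only a sketch: CacheConfigSketch is a hypothetical name, and the factor of 1024 * 1024 bytes per megabyte is an assumption that this reference does not confirm.

```python
from dataclasses import dataclass


@dataclass
class CacheConfigSketch:
    """Hypothetical stand-in that mirrors the documented CacheConfig fields."""

    location: str   # directory where the cache is stored
    size_mb: int    # maximum cache size in megabytes
    use_etag: bool  # whether the ETag is used to update cached files

    def size_bytes(self) -> int:
        # Assumption: one megabyte is counted as 1024 * 1024 bytes.
        return self.size_mb * 1024 * 1024


config = CacheConfigSketch(location="/tmp/msc_cache", size_mb=2, use_etag=True)
print(config.size_bytes())
```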
- class multistorageclient.StorageClient(config: StorageClientConfig)[source]¶
A client for interacting with different storage providers.
Initializes the StorageClient with the given configuration.
- Parameters:
config (StorageClientConfig) – The configuration object for the storage client.
- commit_updates(prefix: str | None = None) None [source]¶
Commits any pending updates to the metadata provider. No-op if not using a metadata provider.
- Parameters:
prefix (str | None) – If provided, scans the prefix to find files to commit.
- Return type:
None
- delete(path: str) None [source]¶
Deletes an object from the storage provider at the specified path.
- Parameters:
path (str) – The path of the object to delete.
- Return type:
None
- download_file(**kwargs: Any) Any ¶
- Parameters:
args (Any) –
kwargs (Any) –
- Return type:
Any
- glob(pattern: str, include_url_prefix: bool = False) List[str] [source]¶
Matches and retrieves a list of objects in the storage provider that match the specified pattern.
- Parameters:
pattern (str) – The pattern to match object paths against, supporting wildcards (e.g., *.txt).
include_url_prefix (bool) – Whether to include the URL prefix msc://profile in the result.
- Returns:
A list of object paths that match the pattern.
- Return type:
List[str]
- info(path: str) ObjectMetadata [source]¶
Retrieves metadata or information about an object stored at the specified path.
- Parameters:
path (str) – The path to the object for which metadata or information is being retrieved.
- Returns:
The metadata of the object.
- Return type:
ObjectMetadata
- is_empty(path: str) bool [source]¶
Checks whether the specified path is empty. A path is considered empty if there are no objects whose keys start with the given path as a prefix.
- Parameters:
path (str) – The path to check. This is typically a prefix representing a directory or folder.
- Returns:
True if no objects exist under the specified path prefix, False otherwise.
- Return type:
bool
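The prefix rule above (a path is empty when no object key starts with it) can be illustrated on a plain list of keys. This is a sketch of the rule only; is_empty_sketch is a hypothetical helper and does not call the library.

```python
from typing import Iterable


def is_empty_sketch(keys: Iterable[str], path: str) -> bool:
    """Return True when no key starts with `path` (hypothetical helper)."""
    return not any(key.startswith(path) for key in keys)


keys = ["datasets/a.tar", "datasets/b.tar", "logs/run.txt"]
print(is_empty_sketch(keys, "datasets/"))
print(is_empty_sketch(keys, "models/"))
```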
- is_file(path: str) bool [source]¶
Checks whether the specified path points to a file (rather than a directory or folder).
- Parameters:
path (str) – The path to check.
- Returns:
True if the path points to a file, False otherwise.
- Return type:
bool
- list(prefix: str = '', start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata] [source]¶
Lists objects in the storage provider under the specified prefix.
- Parameters:
prefix (str) – The prefix to list objects under.
start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.
end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.
- Returns:
An iterator over objects.
- Return type:
Iterator[ObjectMetadata]
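The start_after (exclusive) and end_at (inclusive) bounds can be pictured as a filter over sorted keys. The sketch below assumes keys are compared lexicographically, which this reference does not state; list_keys_sketch is a hypothetical helper that works on plain strings rather than ObjectMetadata.

```python
from typing import Iterable, Iterator, Optional


def list_keys_sketch(
    keys: Iterable[str],
    prefix: str = "",
    start_after: Optional[str] = None,
    end_at: Optional[str] = None,
) -> Iterator[str]:
    """Yield keys under `prefix` that fall in the range (start_after, end_at]."""
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        # Exclusive lower bound: the bound itself is skipped.
        if start_after is not None and key <= start_after:
            continue
        # Inclusive upper bound: the bound itself is kept.
        if end_at is not None and key > end_at:
            continue
        yield key


keys = ["a/1", "a/2", "a/3", "a/4", "b/1"]
print(list(list_keys_sketch(keys, "a/", start_after="a/1", end_at="a/3")))
```

Neither bound has to name an existing key: the comparison only uses the string value.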
- open(path: str, mode: str = 'rb') PosixFile | ObjectFile [source]¶
Returns a file-like object from the storage provider at the specified path.
- Parameters:
path (str) – The path of the object to read.
mode (str) – The file mode.
- Returns:
A file-like object.
- Return type:
PosixFile | ObjectFile
- read(**kwargs: Any) Any ¶
- Parameters:
args (Any) –
kwargs (Any) –
- Return type:
Any
- upload_file(remote_path: str, local_path: str) None [source]¶
Uploads a file from the local file system to the storage provider.
- Parameters:
remote_path (str) – The path where the file should be stored in the storage provider.
local_path (str) – The local path of the file to upload.
- Return type:
None
- class multistorageclient.StorageClientConfig(profile: str, storage_provider: StorageProvider, credentials_provider: CredentialsProvider | None = None, metadata_provider: MetadataProvider | None = None, cache_config: CacheConfig | None = None, retry_config: RetryConfig | None = None)[source]¶
Configuration class for the multistorageclient.StorageClient.
- Parameters:
profile (str) –
storage_provider (StorageProvider) –
credentials_provider (CredentialsProvider | None) –
metadata_provider (MetadataProvider | None) –
cache_config (CacheConfig | None) –
retry_config (RetryConfig | None) –
- cache_config: CacheConfig | None¶
- cache_manager: CacheManager | None¶
- credentials_provider: CredentialsProvider | None¶
- static from_dict(config_dict: Dict[str, Any], profile: str = 'default') StorageClientConfig [source]¶
- Parameters:
config_dict (Dict[str, Any]) –
profile (str) –
- Return type:
StorageClientConfig
- static from_file(profile: str = 'default') StorageClientConfig [source]¶
- Parameters:
profile (str) –
- Return type:
StorageClientConfig
- static from_json(config_json: str, profile: str = 'default') StorageClientConfig [source]¶
- Parameters:
config_json (str) –
profile (str) –
- Return type:
StorageClientConfig
- static from_provider_bundle(config_dict: Dict[str, Any], provider_bundle: ProviderBundle) StorageClientConfig [source]¶
- Parameters:
config_dict (Dict[str, Any]) –
provider_bundle (ProviderBundle) –
- Return type:
StorageClientConfig
- static from_yaml(config_yaml: str, profile: str = 'default') StorageClientConfig [source]¶
- Parameters:
config_yaml (str) –
profile (str) –
- Return type:
StorageClientConfig
- metadata_provider: MetadataProvider | None¶
- profile: str¶
- retry_config: RetryConfig | None¶
- storage_provider: StorageProvider¶
- multistorageclient.download_file(url: str, local_path: str) None [source]¶
Downloads the file at the given URL to a local path.
The function uses the multistorageclient.StorageClient to download a file (object) at the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.
- Parameters:
url (str) – The URL of the file to download (example: msc://profile/prefix/dataset.tar).
local_path (str) – The local path where the file should be downloaded.
- Raises:
ValueError – If the URL’s protocol does not match the expected protocol msc.
- Return type:
None
- multistorageclient.glob(pattern: str) List[str] [source]¶
Return a list of files matching a pattern.
This function supports glob-style patterns for matching multiple files within a storage system. The pattern is parsed, and the associated multistorageclient.StorageClient is used to retrieve the list of matching files.
- Parameters:
pattern (str) – The glob-style pattern to match files (example: msc://profile/prefix/**/*.tar).
- Returns:
A list of file paths matching the pattern.
- Raises:
ValueError – If the URL’s protocol does not match the expected protocol msc.
- Return type:
List[str]
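To make the wildcard example above concrete, the sketch below translates a glob-style pattern into a regular expression and filters a list of keys. The matching rules are assumptions, not the library's documented behavior: `*` and `?` stay within one path segment, and `**/` spans zero or more directories. glob_to_regex and glob_sketch are hypothetical helpers.

```python
import re
from typing import Iterable, List


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob-style pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # one character within a path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def glob_sketch(keys: Iterable[str], pattern: str) -> List[str]:
    """Return the keys that fully match the pattern, in input order."""
    regex = glob_to_regex(pattern)
    return [key for key in keys if regex.fullmatch(key)]


keys = ["prefix/a.tar", "prefix/x/y/b.tar", "prefix/c.txt", "other/d.tar"]
print(glob_sketch(keys, "prefix/**/*.tar"))
```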
- multistorageclient.is_empty(url: str) bool [source]¶
Checks whether the specified URL contains any objects.
- Parameters:
url (str) – The URL to check, typically pointing to a storage location.
- Returns:
True if there are no objects/files under this URL, False otherwise.
- Raises:
ValueError – If the URL’s protocol does not match the expected protocol msc.
- Return type:
bool
- multistorageclient.is_file(url: str) bool [source]¶
Checks whether the specified URL points to a file (rather than a directory or folder).
The function uses the multistorageclient.StorageClient to check if a file (object) exists at the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.
- Parameters:
url (str) – The URL to check for the existence of a file (example: msc://profile/prefix/dataset.tar).
- Return type:
bool
- multistorageclient.open(url: str, mode: str = 'rb') PosixFile | ObjectFile [source]¶
Open a file at the given URL using the specified mode.
The function uses the multistorageclient.StorageClient to open a file at the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.
- Parameters:
url (str) – The URL of the file to open (example: msc://profile/prefix/dataset.tar).
mode (str) – The file mode to open the file in.
- Returns:
A file-like object that allows interaction with the file.
- Raises:
ValueError – If the URL’s protocol does not match the expected protocol msc.
- Return type:
PosixFile | ObjectFile
- multistorageclient.resolve_storage_client(url: str) Tuple[StorageClient, str] [source]¶
Build and return a multistorageclient.StorageClient instance based on the provided URL or path.
This function parses the given URL or path and determines the appropriate storage profile and path. It supports URLs with the protocol msc://, as well as POSIX paths or file:// URLs for local file system access. If the profile has already been instantiated, it returns the cached client. Otherwise, it creates a new StorageClient and caches it.
- Parameters:
url (str) – The storage location, which can be:
- A URL in the format msc://profile/path for object storage.
- A local file system path (absolute POSIX path) or a file:// URL.
- Returns:
A tuple containing the multistorageclient.StorageClient instance and the parsed path.
- Raises:
ValueError – If the URL’s protocol is neither msc nor a valid local file system path.
- Return type:
Tuple[StorageClient, str]
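The parse-then-cache behavior described above can be sketched with the standard library alone. Everything here is illustrative: parse_location and resolve_sketch are hypothetical helpers, the "local" profile name is an assumption, stripping the leading slash from msc:// paths is an assumption, and a plain object() stands in for a real StorageClient.

```python
from typing import Dict, Tuple
from urllib.parse import urlparse

LOCAL_PROFILE = "local"  # assumption: placeholder profile name for local paths
_client_cache: Dict[str, object] = {}  # one cached client per profile


def parse_location(url: str) -> Tuple[str, str]:
    """Split a storage location into (profile, path)."""
    parsed = urlparse(url)
    if parsed.scheme == "msc":
        # msc://profile/path -> profile is the netloc, path is the remainder.
        return parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme == "file":
        return LOCAL_PROFILE, parsed.path
    if parsed.scheme == "" and url.startswith("/"):
        return LOCAL_PROFILE, url
    raise ValueError(f"Unsupported storage location: {url}")


def resolve_sketch(url: str) -> Tuple[object, str]:
    """Return the cached client for the profile, building it on first use."""
    profile, path = parse_location(url)
    if profile not in _client_cache:
        _client_cache[profile] = object()  # stand-in for building a client
    return _client_cache[profile], path


client_a, path_a = resolve_sketch("msc://profile/prefix/dataset.tar")
client_b, path_b = resolve_sketch("msc://profile/other/file.bin")
print(path_a, path_b, client_a is client_b)
```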
- multistorageclient.upload_file(url: str, local_path: str) None [source]¶
Upload a file to the given URL from a local path.
The function uses the multistorageclient.StorageClient to upload a file (object) to the provided path. The URL is parsed, and the corresponding multistorageclient.StorageClient is retrieved or built.
- Parameters:
url (str) – The URL of the file (example: msc://profile/prefix/dataset.tar).
local_path (str) – The local path of the file.
- Raises:
ValueError – If the URL’s protocol does not match the expected protocol msc.
- Return type:
None
Types
- class multistorageclient.types.Credentials(access_key: str, secret_key: str, token: str | None, expiration: str | None)[source]¶
A data class representing the credentials needed to access a storage provider.
- Parameters:
access_key (str) –
secret_key (str) –
token (str | None) –
expiration (str | None) –
- access_key: str¶
The access key for authentication.
- expiration: str | None¶
The expiration time of the credentials in ISO 8601 format.
- is_expired() bool [source]¶
Checks if the credentials are expired based on the expiration time.
- Returns:
True if the credentials are expired, False otherwise.
- Return type:
bool
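Since expiration is documented as an ISO 8601 string, an expiry check could look like the sketch below. It is not the library's implementation: is_expired_sketch is a hypothetical function, and treating a missing expiration as "never expires" and a naive timestamp as UTC are both assumptions.

```python
from datetime import datetime, timezone
from typing import Optional


def is_expired_sketch(expiration: Optional[str], now: Optional[datetime] = None) -> bool:
    """Compare an ISO 8601 expiration string against the current time."""
    if expiration is None:
        return False  # assumption: no expiration means the credentials never expire
    expires_at = datetime.fromisoformat(expiration)
    if expires_at.tzinfo is None:
        # Assumption: timestamps without an offset are interpreted as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return current >= expires_at


fixed_now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_expired_sketch("2023-12-31T23:59:59+00:00", fixed_now))
print(is_expired_sketch("2024-06-01T00:00:00+00:00", fixed_now))
```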
- secret_key: str¶
The secret key for authentication.
- token: str | None¶
An optional security token for temporary credentials.
- class multistorageclient.types.CredentialsProvider[source]¶
Abstract base class for providing credentials to access a storage provider.
- abstract get_credentials() Credentials [source]¶
Retrieves the current credentials.
- Returns:
The current credentials used for authentication.
- Return type:
Credentials
- class multistorageclient.types.MetadataProvider[source]¶
Abstract base class for accessing file metadata.
- abstract add_file(path: str, metadata: ObjectMetadata) None [source]¶
Add a file to be tracked by the MetadataProvider. Does not have to be reflected in listing until a MetadataProvider.commit_updates() forces a persist.
- Parameters:
path (str) – User-supplied path
metadata (ObjectMetadata) – file metadata
- Return type:
None
- abstract commit_updates() None [source]¶
Commit any newly added files; used in conjunction with MetadataProvider.add_file(). The MetadataProvider will persistently record any metadata changes.
- Return type:
None
- abstract get_object_metadata(path: str) ObjectMetadata [source]¶
Retrieves metadata or information about an object stored in the provider.
- Parameters:
path (str) – The path of the object.
- Returns:
A metadata object containing the information about the object.
- Return type:
ObjectMetadata
- abstract glob(pattern: str) List[str] [source]¶
Matches and retrieves a list of object keys in the storage provider that match the specified pattern.
- Parameters:
pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).
- Returns:
A list of object keys that match the specified pattern.
- Return type:
List[str]
- abstract is_writable() bool [source]¶
Returns True if the MetadataProvider supports writes, else False.
- Return type:
bool
- abstract list_objects(prefix: str, start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata] [source]¶
Lists objects in the storage provider under the specified prefix.
- Parameters:
prefix (str) – The prefix or path to list objects under.
start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.
end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.
- Returns:
An iterator over object metadata under the specified prefix.
- Return type:
Iterator[ObjectMetadata]
- abstract realpath(path: str) Tuple[str, bool] [source]¶
Returns the canonical, full real physical path for use by a StorageProvider. This provides translation from user-visible paths to the canonical paths needed by a StorageProvider.
- Parameters:
path (str) – user-supplied virtual path
- Returns:
A canonical physical path and whether the object at the path is valid.
- Return type:
Tuple[str, bool]
- abstract remove_file(path: str) None [source]¶
Remove a file tracked by the MetadataProvider. Does not have to be reflected in listing until a MetadataProvider.commit_updates() forces a persist.
- Parameters:
path (str) – User-supplied virtual path
- Return type:
None
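The add_file(), remove_file(), and commit_updates() descriptions above allow changes to stay invisible to listings until they are committed. One way to picture that staging behavior is the in-memory sketch below. InMemoryMetadataSketch is a hypothetical class that does not subclass the real MetadataProvider, and a provider is also free to reflect changes earlier than this sketch does.

```python
from typing import Dict, List, Set


class InMemoryMetadataSketch:
    """Hypothetical provider that stages changes until commit_updates()."""

    def __init__(self) -> None:
        self._committed: Dict[str, dict] = {}     # what listings can see
        self._pending_adds: Dict[str, dict] = {}  # staged additions
        self._pending_removes: Set[str] = set()   # staged removals

    def add_file(self, path: str, metadata: dict) -> None:
        self._pending_removes.discard(path)
        self._pending_adds[path] = metadata

    def remove_file(self, path: str) -> None:
        self._pending_adds.pop(path, None)
        self._pending_removes.add(path)

    def commit_updates(self) -> None:
        # Apply staged changes, then clear the staging areas.
        self._committed.update(self._pending_adds)
        for path in self._pending_removes:
            self._committed.pop(path, None)
        self._pending_adds.clear()
        self._pending_removes.clear()

    def list_paths(self) -> List[str]:
        return sorted(self._committed)


provider = InMemoryMetadataSketch()
provider.add_file("data/a.bin", {"content_length": 3})
before_commit = provider.list_paths()
provider.commit_updates()
after_commit = provider.list_paths()
print(before_commit, after_commit)
```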
- class multistorageclient.types.ObjectMetadata(key: str, content_length: int, last_modified: datetime, type: str = 'file', content_type: str | None = None, etag: str | None = None)[source]¶
A data class that represents the metadata associated with an object stored in a cloud storage service. This metadata includes both required and optional information about the object.
- Parameters:
key (str) –
content_length (int) –
last_modified (datetime) –
type (str) –
content_type (str | None) –
etag (str | None) –
- content_length: int¶
The size of the object in bytes.
- content_type: str | None = None¶
The MIME type of the object.
- etag: str | None = None¶
The entity tag (ETag) of the object.
- static from_dict(data: dict) ObjectMetadata [source]¶
Creates an ObjectMetadata instance from a dictionary (parsed from JSON).
- Parameters:
data (dict) –
- Return type:
ObjectMetadata
- key: str¶
Relative path of the object.
- last_modified: datetime¶
The timestamp indicating when the object was last modified.
- type: str = 'file'¶
- class multistorageclient.types.ProviderBundle[source]¶
Abstract base class that serves as a container for various providers (storage, credentials, and metadata) that interact with a storage service. The ProviderBundle abstracts access to these providers, allowing for flexible implementations of cloud storage solutions.
- abstract property credentials_provider: CredentialsProvider | None¶
- Returns:
The credentials provider responsible for managing authentication credentials required to access the storage service.
- abstract property metadata_provider: MetadataProvider | None¶
- Returns:
The metadata provider responsible for retrieving metadata about objects in the storage service.
- abstract property storage_provider_config: StorageProviderConfig¶
- Returns:
The configuration for the storage provider, which includes the provider name/type and additional options.
- class multistorageclient.types.Range(offset: int, size: int)[source]¶
Byte-range read.
- Parameters:
offset (int) –
size (int) –
- offset: int¶
- size: int¶
- class multistorageclient.types.RetryConfig(attempts: int = 3, delay: float = 1.0)[source]¶
A data class that represents the configuration for retry strategy.
- Parameters:
attempts (int) –
delay (float) –
- attempts: int = 3¶
The number of attempts before giving up. Must be at least 1.
- delay: float = 1.0¶
The delay (in seconds) between retry attempts. Must be a non-negative value.
- exception multistorageclient.types.RetryableError[source]¶
Exception raised for errors that should trigger a retry.
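The attempts and delay settings, together with a retry-triggering exception, suggest a loop like the sketch below. It shows the general pattern only and is not the library's retry code: call_with_retry and RetryableErrorSketch are hypothetical names, and a fixed delay between attempts is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableErrorSketch(Exception):
    """Hypothetical stand-in for an error that should trigger a retry."""


def call_with_retry(func: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call func up to `attempts` times, sleeping `delay` seconds between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    if delay < 0:
        raise ValueError("delay must be non-negative")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except RetryableErrorSketch:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky() -> str:
    # Fails twice, then succeeds on the third call.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RetryableErrorSketch("transient failure")
    return "ok"


result = call_with_retry(flaky, attempts=3, delay=0.0)
print(result, calls["count"])
```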
- class multistorageclient.types.StorageProvider[source]¶
Abstract base class for interacting with a storage provider.
- abstract delete_object(path: str) None [source]¶
Deletes an object from the storage provider.
- Parameters:
path (str) – The path of the object to delete.
- Return type:
None
- abstract download_file(remote_path: str, f: str | IO, metadata: ObjectMetadata | None = None) None [source]¶
Downloads a file from the storage provider to the local file system.
- Parameters:
remote_path (str) – The path of the file to download.
f (str | IO) – The destination for the downloaded file. This can either be a string representing the local file path where the file will be saved, or a file-like object to write the downloaded content into.
metadata (ObjectMetadata | None) – Metadata about the object to download.
- Return type:
None
- abstract get_object(path: str, byte_range: Range | None = None) bytes [source]¶
Retrieves an object from the storage provider.
- Parameters:
path (str) – The path where the object is stored.
byte_range (Range | None) –
- Returns:
The content of the retrieved object.
- Return type:
bytes
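The byte_range parameter takes a Range with an offset and a size. On an in-memory payload, a byte-range read amounts to a slice, as sketched below. RangeSketch and read_bytes_sketch are hypothetical stand-ins, and truncating a range that runs past the end of the data is an assumption about edge-case behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeSketch:
    """Hypothetical stand-in mirroring the documented Range fields."""

    offset: int  # index of the first byte to read
    size: int    # number of bytes to read


def read_bytes_sketch(data: bytes, byte_range: Optional[RangeSketch] = None) -> bytes:
    """Return the whole payload, or only the requested byte range."""
    if byte_range is None:
        return data
    return data[byte_range.offset : byte_range.offset + byte_range.size]


payload = b"0123456789"
print(read_bytes_sketch(payload, RangeSketch(offset=2, size=4)))
```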
- abstract get_object_metadata(path: str) ObjectMetadata [source]¶
Retrieves metadata or information about an object stored in the provider.
- Parameters:
path (str) – The path of the object.
- Returns:
A metadata object containing the information about the object.
- Return type:
ObjectMetadata
- abstract glob(pattern: str) List[str] [source]¶
Matches and retrieves a list of object keys in the storage provider that match the specified pattern.
- Parameters:
pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).
- Returns:
A list of object keys that match the specified pattern.
- Return type:
List[str]
- abstract is_file(path: str) bool [source]¶
Checks whether the specified key in the storage provider points to a file (as opposed to a folder or directory).
- Parameters:
path (str) – The path to check.
- Returns:
True if the key points to a file, False if it points to a directory or folder.
- Return type:
bool
- abstract list_objects(prefix: str, start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata] [source]¶
Lists objects in the storage provider under the specified prefix.
- Parameters:
prefix (str) – The prefix or path to list objects under.
start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.
end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.
- Returns:
An iterator over object metadata under the specified prefix.
- Return type:
Iterator[ObjectMetadata]
- abstract put_object(path: str, body: bytes) None [source]¶
Uploads an object to the storage provider.
- Parameters:
path (str) – The path where the object will be stored.
body (bytes) – The content of the object to store.
- Return type:
None
- abstract upload_file(remote_path: str, f: str | IO) None [source]¶
Uploads a file from the local file system to the storage provider.
- Parameters:
remote_path (str) – The path where the object will be stored.
f (str | IO) – The source file to upload. This can either be a string representing the local file path, or a file-like object (e.g., an open file handle).
- Return type:
None
- class multistorageclient.types.StorageProviderConfig(type: str, options: Dict[str, Any] | None = None)[source]¶
A data class that represents the configuration needed to initialize a storage provider.
- Parameters:
type (str) –
options (Dict[str, Any] | None) –
- options: Dict[str, Any] | None = None¶
Additional options required to configure the storage provider (e.g., endpoint URLs, region, etc.).
- type: str¶
The name or type of the storage provider (e.g., s3, gcs, oci, azure).
Providers
- class multistorageclient.providers.AIStoreStorageProvider(endpoint: str, provider: str = 'ais', skip_verify: bool = True, ca_cert: str | None = None, timeout: float | Tuple[float, float] | None = None, base_path: str = '', credentials_provider: CredentialsProvider | None = None, **kwargs: Any)[source]¶
AIStore client for managing buckets, objects, and ETL jobs.
- Parameters:
endpoint (str) – The AIStore endpoint.
skip_verify (bool) – Whether to skip SSL certificate verification.
ca_cert (str | None) – Path to a CA certificate file for SSL verification.
timeout (float | Tuple[float, float] | None) – Request timeout in seconds; a single float for both connect/read timeouts (e.g., 5.0), a tuple for separate connect/read timeouts (e.g., (3.0, 10.0)), or None to disable timeout.
token – Authorization token. If not provided, the AIS_AUTHN_TOKEN environment variable will be used.
base_path (str) – The root prefix path within the bucket where all operations will be scoped.
provider (str) –
credentials_provider (CredentialsProvider | None) –
kwargs (Any) –
- class multistorageclient.providers.AzureBlobStorageProvider(endpoint_url: str, base_path: str = '', credentials_provider: CredentialsProvider | None = None)[source]¶
A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Azure Blob Storage.
Initializes the AzureBlobStorageProvider with the endpoint URL and optional credentials provider.
- Parameters:
endpoint_url (str) – The Azure storage account URL.
base_path (str) – The root prefix path within the container where all operations will be scoped.
credentials_provider (CredentialsProvider | None) – The provider to retrieve Azure credentials.
- class multistorageclient.providers.GoogleStorageProvider(project_id: str, base_path: str = '', credentials_provider: CredentialsProvider | None = None)[source]¶
A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Google Cloud Storage.
Initializes the GoogleStorageProvider with the project ID and optional credentials provider.
- Parameters:
project_id (str) – The Google Cloud project ID.
base_path (str) – The root prefix path within the bucket where all operations will be scoped.
credentials_provider (CredentialsProvider | None) – The provider to retrieve GCS credentials.
- class multistorageclient.providers.ManifestMetadataProvider(storage_provider: StorageProvider, manifest_path: str, writable: bool = False)[source]¶
Creates a ManifestMetadataProvider.
- Parameters:
storage_provider (StorageProvider) – Storage provider.
manifest_path (str) – Main manifest file path.
writable (bool) – If true, allows modifications and new manifests to be written.
- add_file(path: str, metadata: ObjectMetadata) None [source]¶
Add a file to be tracked by the MetadataProvider. Does not have to be reflected in listing until a MetadataProvider.commit_updates() forces a persist.
- Parameters:
path (str) – User-supplied path
metadata (ObjectMetadata) – file metadata
- Return type:
None
- commit_updates() None [source]¶
Commit any newly added files; used in conjunction with MetadataProvider.add_file(). The MetadataProvider will persistently record any metadata changes.
- Return type:
None
- get_object_metadata(path: str) ObjectMetadata [source]¶
Retrieves metadata or information about an object stored in the provider.
- Parameters:
path (str) – The path of the object.
- Returns:
A metadata object containing the information about the object.
- Return type:
ObjectMetadata
- glob(pattern: str) List[str] [source]¶
Matches and retrieves a list of object keys in the storage provider that match the specified pattern.
- Parameters:
pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).
- Returns:
A list of object keys that match the specified pattern.
- Return type:
List[str]
- is_writable() bool [source]¶
Returns True if the MetadataProvider supports writes, else False.
- Return type:
bool
- list_objects(prefix: str, start_after: str | None = None, end_at: str | None = None) Iterator[ObjectMetadata] [source]¶
Lists objects in the storage provider under the specified prefix.
- Parameters:
prefix (str) – The prefix or path to list objects under.
start_after (str | None) – The key to start after (i.e. exclusive). An object with this key doesn’t have to exist.
end_at (str | None) – The key to end at (i.e. inclusive). An object with this key doesn’t have to exist.
- Returns:
An iterator over object metadata under the specified prefix.
- Return type:
Iterator[ObjectMetadata]
- realpath(path: str) Tuple[str, bool] [source]¶
Returns the canonical, full real physical path for use by a StorageProvider. This provides translation from user-visible paths to the canonical paths needed by a StorageProvider.
- Parameters:
path (str) – user-supplied virtual path
- Returns:
A canonical physical path and whether the object at the path is valid.
- Return type:
Tuple[str, bool]
- class multistorageclient.providers.OracleStorageProvider(namespace: str, base_path: str = '', credentials_provider: CredentialsProvider | None = None, **kwargs: Any)[source]¶
A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Oracle Cloud Infrastructure (OCI) Object Storage.
Initializes the OracleStorageProvider with the region, compartment ID, and optional credentials provider.
- Parameters:
region_name – The OCI region where the Object Storage is located.
compartment_id – The OCI compartment ID for the Object Storage.
base_path (str) – The root prefix path within the bucket where all operations will be scoped.
credentials_provider (CredentialsProvider | None) – The provider to retrieve OCI credentials.
namespace (str) –
kwargs (Any) –
- class multistorageclient.providers.PosixFileStorageProvider(base_path: str, **kwargs: Any)[source]¶
- Parameters:
base_path (str) –
kwargs (Any) –
- glob(pattern: str) List[str] [source]¶
Matches and retrieves a list of object keys in the storage provider that match the specified pattern.
- Parameters:
pattern (str) – The pattern to match object keys against, supporting wildcards (e.g., *.txt).
- Returns:
A list of object keys that match the specified pattern.
- Return type:
List[str]
- class multistorageclient.providers.S3StorageProvider(region_name: str = '', endpoint_url: str = '', base_path: str = '', credentials_provider: CredentialsProvider | None = None, **kwargs: Any)[source]¶
A concrete implementation of the multistorageclient.types.StorageProvider for interacting with Amazon S3 or SwiftStack.
Initializes the S3StorageProvider with the region, endpoint URL, and optional credentials provider.
- Parameters:
region_name (str) – The AWS region where the S3 bucket is located.
endpoint_url (str) – The custom endpoint URL for the S3 service.
base_path (str) – The root prefix path within the S3 bucket where all operations will be scoped.
credentials_provider (CredentialsProvider | None) – The provider to retrieve S3 credentials.
kwargs (Any) –
- class multistorageclient.providers.StaticAISCredentialProvider(username: str | None = None, password: str | None = None, authn_endpoint: str | None = None, token: str | None = None, skip_verify: bool = True, ca_cert: str | None = None)[source]¶
A concrete implementation of the multistorageclient.types.CredentialsProvider that provides static AIStore credentials.
Initializes the StaticAISCredentialProvider with the given credentials.
- Parameters:
username (str | None) – The username for the AIStore authentication.
password (str | None) – The password for the AIStore authentication.
authn_endpoint (str | None) – The AIStore authentication endpoint.
token (str | None) – The AIStore authentication token. This is used for authentication if username, password and authn_endpoint are not provided.
skip_verify (bool) – If true, skip SSL certificate verification.
ca_cert (str | None) – Path to a CA certificate file for SSL verification.
- get_credentials() Credentials [source]¶
Retrieves the current credentials.
- Returns:
The current credentials used for authentication.
- Return type:
Credentials
- class multistorageclient.providers.StaticAzureCredentialsProvider(connection: str)[source]¶
A concrete implementation of the multistorageclient.types.CredentialsProvider that provides static Azure credentials.
Initializes the StaticAzureCredentialsProvider with the provided connection string.
- Parameters:
connection (str) – The connection string for Azure Blob Storage authentication.
- get_credentials() Credentials [source]¶
Retrieves the current credentials.
- Returns:
The current credentials used for authentication.
- Return type:
Credentials
- class multistorageclient.providers.StaticS3CredentialsProvider(access_key: str, secret_key: str, session_token: str | None = None)[source]¶
A concrete implementation of the multistorageclient.types.CredentialsProvider that provides static S3 credentials.
Initializes the StaticS3CredentialsProvider with the provided access key, secret key, and optional session token.
- Parameters:
access_key (str) – The access key for S3 authentication.
secret_key (str) – The secret key for S3 authentication.
session_token (str | None) – An optional session token for temporary credentials.
- get_credentials() Credentials [source]¶
Retrieves the current credentials.
- Returns:
The current credentials used for authentication.
- Return type:
Credentials
Higher-Level Libraries
fsspec
- class multistorageclient.contrib.async_fs.MultiAsyncFileSystem(*args, **kwargs)[source]¶
Custom fsspec.asyn.AsyncFileSystem implementation for the MSC protocol (msc://). Uses multistorageclient.StorageClient for backend operations.
Initializes the MultiAsyncFileSystem.
- Parameters:
kwargs – Additional arguments for the fsspec.asyn.AsyncFileSystem.
- static asynchronize_sync(func: Callable[[...], Any], *args: Any, **kwargs: Any) Any [source]¶
Runs a synchronous function asynchronously using asyncio.
- Parameters:
func (Callable[[...], Any]) – The synchronous function to be executed asynchronously.
args (Any) – Positional arguments to pass to the function.
kwargs (Any) – Keyword arguments to pass to the function.
- Returns:
The result of the asynchronous execution of the function.
- Return type:
Any
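A common way to run a synchronous function without blocking the event loop is to hand it to an executor. The sketch below shows that pattern with asyncio. It is an illustration of the idea, not the library's implementation, and asynchronize_sync_sketch is a hypothetical name.

```python
import asyncio
import functools
from typing import Any, Callable


async def asynchronize_sync_sketch(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run a blocking function in the default executor and await its result."""
    loop = asyncio.get_running_loop()
    # run_in_executor accepts only positional arguments, so bind kwargs first.
    bound = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(None, bound)


def add(a: int, b: int = 0) -> int:
    return a + b


result = asyncio.run(asynchronize_sync_sketch(add, 2, b=3))
print(result)
```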
- cat_file(path: str, **kwargs: Any) bytes [source]¶
Reads the contents of a file at the given path.
- Parameters:
path (str) – The file path to read from.
kwargs (Any) – Additional arguments for file reading functionality.
- Returns:
The contents of the file as bytes.
- Return type:
bytes
- get_file(rpath: str, lpath: str, **kwargs: Any) None [source]¶
Downloads a file from the remote path to the local path.
- Parameters:
rpath (str) – The remote path of the file to download.
lpath (str) – The local path to store the file.
kwargs (Any) – Additional arguments for file retrieval functionality.
- Return type:
None
- glob(path: str, maxdepth: int | None = None, **kwargs: Any) List[str] [source]¶
Matches and retrieves a list of objects in the storage provider that match the specified pattern.
- Parameters:
path (str) – The pattern to match object paths against, supporting wildcards (e.g., *.txt).
maxdepth (int | None) – The maximum depth of the pattern match.
kwargs (Any) –
- Return type:
List[str]
- Returns:
A list of object paths that match the pattern.
- info(path: str, **kwargs: Any) Dict[str, Any] [source]¶
Retrieves metadata information for a file.
- Parameters:
path (str) – The file path to retrieve information for.
kwargs (Any) – Additional arguments for info functionality.
- Returns:
A dictionary containing file metadata such as ETag, last modified, and size.
- Return type:
Dict[str, Any]
- ls(path: str, detail: bool = True, **kwargs: Any) List[Dict[str, Any]] | List[str] [source]¶
Lists the contents of a directory.
- Parameters:
path (str) – The directory path to list.
detail (bool) – Whether to return detailed information for each file.
kwargs (Any) – Additional arguments for list functionality.
- Returns:
A list of file names or detailed information depending on the ‘detail’ argument.
- Return type:
List[Dict[str, Any]] | List[str]
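Because ls can return either a list of dictionaries or a list of strings, callers that accept both shapes need to normalize the result. The sketch below is a hypothetical helper (entry_names is not part of the library) and assumes each detailed entry stores its path under a "name" key, as is conventional for fsspec-style listings.

```python
from typing import Any, Dict, List, Union


def entry_names(listing: Union[List[Dict[str, Any]], List[str]]) -> List[str]:
    """Hypothetical helper: reduce either ls() return shape to a list of paths."""
    names: List[str] = []
    for entry in listing:
        if isinstance(entry, dict):
            # Assumption: detailed entries carry their path under "name".
            names.append(entry["name"])
        else:
            names.append(entry)
    return names


detailed = [{"name": "bucket/a.txt", "size": 3}, {"name": "bucket/b.txt", "size": 7}]
plain = ["bucket/a.txt", "bucket/b.txt"]
```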
- open(path: str, mode: str = 'rb', **kwargs: Any) PosixFile | ObjectFile [source]¶
Opens a file at the given path.
- Parameters:
path (str) – The file path to open.
mode (str) – The mode in which to open the file.
kwargs (Any) – Additional arguments for file opening.
- Returns:
A file object (PosixFile or ObjectFile) representing the opened file.
- Return type:
PosixFile | ObjectFile
- pipe_file(path: str, value: bytes, **kwargs: Any) None [source]¶
Writes a value (bytes) directly to a file at the given path.
- Parameters:
path (str) – The file path to write the value to.
value (bytes) – The bytes to write to the file.
kwargs (Any) – Additional arguments for writing functionality.
- Return type:
None
- protocol: ClassVar[str | tuple[str, ...]] = 'msc'¶
- put_file(lpath: str, rpath: str, **kwargs: Any) None [source]¶
Uploads a local file to the remote path.
- Parameters:
lpath (str) – The local path of the file to upload.
rpath (str) – The remote path to store the file.
kwargs (Any) – Additional arguments for file upload functionality.
- Return type:
None
- resolve_path_and_storage_client(path: str) Tuple[StorageClient, str] [source]¶
Resolves the path and retrieves the associated multistorageclient.StorageClient.
- Parameters:
path (str) – The file path to resolve.
- Returns:
A tuple containing the multistorageclient.StorageClient and the resolved path.
- Return type:
Tuple[StorageClient, str]
- rm(path: str, recursive: bool = False, **kwargs: Any) None [source]¶
Removes a file or directory.
- Parameters:
path (str) – The file or directory path to remove.
recursive (bool) – If True, will remove directories and their contents recursively.
kwargs (Any) – Additional arguments for remove functionality.
- Raises:
IsADirectoryError – If the path is a directory and recursive is not set to True.
- Return type:
None
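The recursive contract of rm can be illustrated with a local-filesystem analogue. This is only a sketch of the documented behavior (a directory is refused unless recursive is True); the function remove below is hypothetical and uses os and shutil rather than any storage provider.

```python
import os
import shutil
import tempfile


def remove(path: str, recursive: bool = False) -> None:
    """Hypothetical local analogue of the documented rm() contract."""
    if os.path.isdir(path):
        if not recursive:
            # Mirrors the documented error for directories without recursive=True.
            raise IsADirectoryError(f"{path} is a directory; pass recursive=True")
        shutil.rmtree(path)
    else:
        os.remove(path)


# Build a small directory tree to exercise both branches.
root = tempfile.mkdtemp()
nested = os.path.join(root, "dataset")
os.makedirs(nested)
with open(os.path.join(nested, "part-0.bin"), "wb") as handle:
    handle.write(b"payload")

try:
    remove(nested)
    refused = False
except IsADirectoryError:
    refused = True

remove(nested, recursive=True)
removed = not os.path.exists(nested)
shutil.rmtree(root)
```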
NumPy¶
- multistorageclient.contrib.numpy.load(*args: Any, **kwargs: Any) ndarray | Dict[str, ndarray] | NpzFile [source]¶
Adapt numpy.load.
- Parameters:
args (Any) –
kwargs (Any) –
- Return type:
ndarray | Dict[str, ndarray] | NpzFile
PyTorch¶
Xarray¶
- multistorageclient.contrib.xarray.open_zarr(*args: Any, **kwargs: Any) Dataset [source]¶
Adapt xarray.open_zarr to use multistorageclient.contrib.zarr.LazyZarrStore when the path matches the msc protocol.
If the path starts with the MSC protocol, it uses multistorageclient.contrib.zarr.LazyZarrStore with a resolved storage client and prefix, passing msc_max_workers if provided. Otherwise, it directly calls xarray.open_zarr.
- Parameters:
args (Any) –
kwargs (Any) –
- Return type:
Dataset
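The protocol check described above can be sketched with the standard library. The helpers split_msc_url and describe_open are hypothetical, as is the example profile name; the sketch assumes URLs of the form msc://profile/prefix and only reports which branch would be taken rather than opening anything.

```python
from typing import Any, Optional, Tuple
from urllib.parse import urlparse

MSC_PROTOCOL = "msc"


def split_msc_url(path: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (profile, prefix) for msc:// URLs, else None."""
    parsed = urlparse(path)
    if parsed.scheme != MSC_PROTOCOL:
        return None
    # Assumption: the URL authority is the profile and the rest is the prefix.
    return parsed.netloc, parsed.path.lstrip("/")


def describe_open(path: str, **kwargs: Any) -> Tuple[Any, ...]:
    """Report which branch an adapter following the description would take."""
    target = split_msc_url(path)
    if target is None:
        # Non-MSC paths go straight to the wrapped function.
        return ("passthrough", path)
    profile, prefix = target
    # msc_max_workers is forwarded only when the caller supplied it.
    return ("lazy_store", profile, prefix, kwargs.get("msc_max_workers"))


msc_case = describe_open("msc://my-profile/data/weather.zarr", msc_max_workers=8)
local_case = describe_open("/tmp/weather.zarr")
```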
Zarr¶
- class multistorageclient.contrib.zarr.LazyZarrStore(storage_client: StorageClient, prefix: str = '', msc_max_workers: int | None = None)[source]¶
- Parameters:
storage_client (StorageClient) –
prefix (str) –
msc_max_workers (int | None) –
- getitems(keys: Sequence[str], *, contexts: Any) Mapping[str, Any] [source]¶
Retrieve data from multiple keys.
Parameters¶
- keys: Iterable[str]
The keys to retrieve
- contexts: Mapping[str, Context]
A mapping of keys to their context. Each context is a mapping of store-specific information. For example, a context could be a dict telling the store the preferred output array type: {"meta_array": cupy.empty(())}
Returns¶
- Mapping
A collection mapping the input keys to their results.
Notes¶
This default implementation uses __getitem__() to read each key sequentially and ignores contexts. Override this method to implement concurrent reads of multiple keys and/or to utilize the contexts.
- Parameters:
keys (Sequence[str]) –
contexts (Any) –
- Return type:
Mapping[str, Any]
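The Notes above suggest overriding getitems to read keys concurrently. A minimal sketch of that idea follows, using a thread pool over a dictionary-backed store. ConcurrentDictStore is a hypothetical class, not the LazyZarrStore implementation; the choice to skip missing keys and to ignore contexts are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, Mapping, Optional, Sequence

_MISSING = object()  # Sentinel so that stored falsy values are not dropped.


class ConcurrentDictStore:
    """Hypothetical store that fetches several keys in parallel threads."""

    def __init__(self, data: Dict[str, bytes], max_workers: Optional[int] = None) -> None:
        self._data = dict(data)
        self._max_workers = max_workers

    def __getitem__(self, key: str) -> bytes:
        # In a real store this would be a blocking network read.
        return self._data[key]

    def _fetch(self, key: str) -> Any:
        try:
            return self[key]
        except KeyError:
            return _MISSING

    def getitems(self, keys: Sequence[str], *, contexts: Any = None) -> Mapping[str, Any]:
        # contexts is accepted for signature compatibility but ignored here.
        with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
            values = list(pool.map(self._fetch, keys))
        # Assumption: keys that are absent from the store are omitted.
        return {k: v for k, v in zip(keys, values) if v is not _MISSING}


store = ConcurrentDictStore({"a/.zarray": b"{}", "a/0.0": b"\x00\x01"}, max_workers=4)
found = store.getitems(["a/0.0", "a/.zarray", "a/9.9"], contexts={})
```

A thread pool suits this case because object reads are I/O-bound, so threads spend most of their time waiting rather than competing for the interpreter.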
- multistorageclient.contrib.zarr.open_consolidated(*args: Any, **kwargs: Any) Group [source]¶
Adapt zarr.open_consolidated to use LazyZarrStore when the path matches the msc protocol.
If the path starts with the MSC protocol, it uses LazyZarrStore with a resolved storage client and prefix, passing msc_max_workers if provided. Otherwise, it directly calls zarr.open_consolidated.
- Parameters:
args (Any) –
kwargs (Any) –
- Return type:
Group