cache#
Cache directory management and introspection for pipeline SQLite databases.
Provides utilities to locate, list, inspect, and clean up .db files
produced by pipeline runs. The default cache location follows the
XDG Base Directory Specification and can be overridden with the
PSNC_CACHE_DIR environment variable.
Usage#
>>> from physicsnemo_curator.core.cache import default_cache_dir, list_databases
>>> cache = default_cache_dir()
>>> for info in list_databases(cache):
...     print(info.hash_prefix, info.source_name, info.completed)
Attributes#
Classes#
DBInfo | Metadata about a single pipeline database file.
Functions#
cache_size | Return the total size in bytes of all .db files in the cache.
clear_cache | Remove all .db files from the cache directory.
default_cache_dir | Return the default cache directory for pipeline databases.
list_databases | List all pipeline databases in the cache directory.
remove_databases | Remove pipeline databases matching the given hash prefixes.
remove_older_than | Remove pipeline databases older than max_age (by file mtime).
Module Contents#
- class physicsnemo_curator.core.cache.DBInfo[source]#
Metadata about a single pipeline database file.
- Parameters:
hash_prefix (str) – Filename stem (the config hash prefix used as the DB name).
path (pathlib.Path) – Absolute path to the .db file.
size_bytes (int) – File size in bytes.
created (datetime) – Pipeline run start timestamp (from pipeline_runs.started_at).
source_name (str) – Registered source name extracted from the stored config JSON.
sink_name (str) – Registered sink name extracted from the stored config JSON.
filter_names (list[str]) – Registered filter names extracted from the stored config JSON.
total (int) – Total number of index_results rows (completed + failed).
completed (int) – Number of completed index results.
failed (int) – Number of failed index results.
- created: datetime.datetime#
- path: pathlib.Path#
- physicsnemo_curator.core.cache.cache_size(*, cache_dir: pathlib.Path | None = None) → int[source]#
Return the total size in bytes of all .db files in the cache.
- Parameters:
cache_dir (pathlib.Path | None, optional) – Directory to measure. Defaults to default_cache_dir().
- Returns:
Total bytes occupied by .db files, or 0 if the directory is empty or does not exist.
- Return type:
int
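The documented behavior of cache_size can be approximated with the standard library alone. The sketch below is illustrative and is not the library's implementation: the name `db_total_size` is hypothetical, and it assumes that only `.db` files directly inside the directory are counted (no recursion into subdirectories).

```python
from pathlib import Path


def db_total_size(cache_dir: Path) -> int:
    """Sum the sizes of all ``.db`` files directly inside ``cache_dir``.

    Hypothetical helper; assumes a non-recursive scan.
    """
    # A missing directory counts as an empty cache.
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.glob("*.db") if p.is_file())
```

Returning 0 for a missing directory mirrors the documented contract, so callers do not need to create the cache directory before measuring it.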
- physicsnemo_curator.core.cache.clear_cache(*, cache_dir: pathlib.Path | None = None) → int[source]#
Remove all .db files from the cache directory.
- Parameters:
cache_dir (pathlib.Path | None, optional) – Directory to clear. Defaults to default_cache_dir().
- Returns:
Number of database files removed.
- Return type:
int
- physicsnemo_curator.core.cache.default_cache_dir() → pathlib.Path[source]#
Return the default cache directory for pipeline databases.
Resolution order (highest priority first):
1. PSNC_CACHE_DIR environment variable
2. $XDG_CACHE_HOME/psnc/
3. ~/.cache/psnc/
- Returns:
Absolute path to the cache directory (may not exist yet).
- Return type:
pathlib.Path
Examples
>>> import os
>>> os.environ["PSNC_CACHE_DIR"] = "/tmp/my_cache"
>>> default_cache_dir()
PosixPath('/tmp/my_cache')
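The resolution order above can be sketched as a small standalone function. This is a minimal illustration rather than the library's code: `resolve_cache_dir` and its `env` parameter are hypothetical, and treating an empty environment variable as unset is an assumption (the XDG specification prescribes that for `XDG_CACHE_HOME`; the handling of an empty `PSNC_CACHE_DIR` is not documented here).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve a cache directory using the documented priority order.

    Hypothetical helper; ``env`` defaults to ``os.environ`` and exists
    only to make the function easy to exercise without global state.
    """
    env = os.environ if env is None else env

    # 1. An explicit override wins (assumption: empty string means unset).
    override = env.get("PSNC_CACHE_DIR")
    if override:
        return Path(override).expanduser()

    # 2. Otherwise use the XDG cache home, if set.
    xdg_cache_home = env.get("XDG_CACHE_HOME")
    if xdg_cache_home:
        return Path(xdg_cache_home) / "psnc"

    # 3. Fall back to the conventional per-user cache location.
    return Path.home() / ".cache" / "psnc"
```

Passing a plain dictionary, for example `resolve_cache_dir({"XDG_CACHE_HOME": "/x"})`, exercises each branch without modifying the process environment.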
- physicsnemo_curator.core.cache.list_databases(cache_dir: pathlib.Path | None = None) → list[DBInfo][source]#
List all pipeline databases in the cache directory.
Opens each .db file, reads the pipeline_runs and index_results tables, and returns metadata sorted newest first (by started_at timestamp). Corrupt or unreadable databases are silently skipped.
- Parameters:
cache_dir (pathlib.Path | None, optional) – Directory to scan. Defaults to default_cache_dir().
- Returns:
Metadata for each valid database, sorted newest first.
- Return type:
list[DBInfo]
- physicsnemo_curator.core.cache.remove_databases(hash_prefixes: list[str], *, cache_dir: pathlib.Path | None = None) → int
Remove pipeline databases matching the given hash prefixes.
Each prefix is matched against .db filenames (stems). A prefix that matches more than one file raises ValueError to prevent accidental deletion.
- Parameters:
hash_prefixes (list[str]) – Hash prefix strings to match against DB file stems.
cache_dir (pathlib.Path | None, optional) – Directory to scan. Defaults to default_cache_dir().
- Returns:
Number of database files removed.
- Return type:
int
- Raises:
ValueError – If a prefix is ambiguous (matches more than one .db file).
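The prefix-matching rule described for remove_databases can be illustrated with a short standalone sketch. It is not the library's implementation: `remove_by_prefix` is a hypothetical name, and two behaviors are assumptions made for the example, namely that a prefix matching nothing is ignored and that every prefix is resolved before any file is deleted.

```python
from pathlib import Path
from typing import Iterable, List


def remove_by_prefix(prefixes: Iterable[str], cache_dir: Path) -> int:
    """Delete ``.db`` files whose stem starts with one of ``prefixes``.

    Hypothetical helper. Raises ValueError if any prefix matches more
    than one file; nothing is deleted in that case.
    """
    targets: List[Path] = []
    for prefix in prefixes:
        matches = [p for p in cache_dir.glob("*.db") if p.stem.startswith(prefix)]
        if len(matches) > 1:
            names = ", ".join(sorted(p.name for p in matches))
            raise ValueError(f"Prefix {prefix!r} is ambiguous: {names}")
        # Assumption: a prefix with no match is skipped rather than an error.
        targets.extend(matches)

    removed = 0
    # dict.fromkeys drops duplicates when two prefixes select the same file.
    for path in dict.fromkeys(targets):
        path.unlink()
        removed += 1
    return removed
```

Resolving all prefixes up front means an ambiguous prefix late in the list cannot leave the cache partially cleaned.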
- physicsnemo_curator.core.cache.remove_older_than(max_age: datetime.timedelta, *, cache_dir: pathlib.Path | None = None) → int
Remove pipeline databases older than max_age (by file mtime).
- Parameters:
max_age (timedelta) – Maximum age. Files with an mtime older than now - max_age are removed.
cache_dir (pathlib.Path | None, optional) – Directory to scan. Defaults to default_cache_dir().
- Returns:
Number of database files removed.
- Return type:
int
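The mtime-based cutoff used by remove_older_than can be sketched with the standard library. This is an illustrative version under stated assumptions, not the library's code: `remove_stale` and its `now` parameter are hypothetical, and the scan is assumed to be non-recursive.

```python
import time
from datetime import timedelta
from pathlib import Path
from typing import Optional


def remove_stale(max_age: timedelta, cache_dir: Path, now: Optional[float] = None) -> int:
    """Delete ``.db`` files whose mtime is older than ``now - max_age``.

    Hypothetical helper; ``now`` is a POSIX timestamp and defaults to
    the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age.total_seconds()

    removed = 0
    # Materialize the listing first so files are not deleted mid-iteration.
    for path in list(cache_dir.glob("*.db")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed
```

Because the comparison uses the file's modification time rather than the run's started_at timestamp, a database that was recently written to is kept even if its pipeline run began long ago.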
- physicsnemo_curator.core.cache.logger#