cuda.core.utils.InMemoryProgramCache#

class cuda.core.utils.InMemoryProgramCache(*, max_size_bytes: int | None = None)#

In-memory program cache with LRU eviction.

Suitable for single-process workflows that want to avoid disk I/O – a typical application compiles its kernels once per process and looks them up many times. Entries live only for the lifetime of the process; use FileStreamProgramCache when the cache should persist across runs.

Like FileStreamProgramCache, this backend is bytes-in / bytes-out: __setitem__ accepts bytes, bytearray, memoryview, or any ObjectCode (path-backed too – the file is read at write time so the cached entry holds the binary content, not a path). __getitem__ returns bytes.

Parameters:

max_size_bytes – Optional cap on the sum of stored payload sizes. When exceeded, LRU eviction runs until the total fits. None means unbounded. The size-only bound mirrors FileStreamProgramCache.

Notes

Recency is updated on __getitem__(); get is the recommended lookup since the cache deliberately omits __contains__ (the if key in cache: ... idiom is racy across processes; see ProgramCacheResource).

Thread safety: a threading.RLock serialises every method, so the cache can be shared across threads without external locking.

Methods

__init__(
*,
max_size_bytes: int | None = None,
) None#
clear() None#

Remove every entry from the cache.

close() None#

Release backend resources.

The default implementation does nothing. Subclasses that hold long-lived state (open file handles, database connections, network sockets, …) should override this to release them.

Callers should use the context-manager form (with cache:) or call close() explicitly when finished, so code stays portable across backends that do hold resources.

get(
key: bytes | str,
default: bytes | None = None,
) bytes | None#

Return self[key] or default if absent.

update(
items: Mapping[bytes | str, bytes | bytearray | memoryview | ObjectCode] | Iterable[tuple[bytes | str, bytes | bytearray | memoryview | ObjectCode]],
/,
) None#

Bulk __setitem__.

Accepts a mapping or an iterable of (key, value) pairs. Each write goes through __setitem__ so backend-specific value coercion (e.g. extracting bytes from an ObjectCode) and size-cap enforcement run on every entry. Not transactional – a failure mid-iteration leaves earlier writes committed.