cuda.core.utils.ProgramCacheResource#

class cuda.core.utils.ProgramCacheResource#

Abstract base class for compiled-program caches.

Concrete implementations store and retrieve raw binary bytes keyed by bytes or str. A str key is encoded as UTF-8 before being used, so "k" and b"k" refer to the same entry. A typical key is produced by make_program_cache_key(), which returns bytes.
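The key normalization can be sketched with a hypothetical in-memory backend (`DictProgramCache` below is illustrative only, not part of cuda.core):

```python
# Minimal sketch of the key handling described above: a str key is
# UTF-8-encoded before use, so "k" and b"k" address the same entry.
class DictProgramCache:
    def __init__(self):
        self._store = {}

    @staticmethod
    def _normalize(key):
        # str keys are encoded as UTF-8; bytes-like keys pass through.
        return key.encode("utf-8") if isinstance(key, str) else bytes(key)

    def __setitem__(self, key, value):
        self._store[self._normalize(key)] = bytes(value)

    def __getitem__(self, key):
        return self._store[self._normalize(key)]


cache = DictProgramCache()
cache["k"] = b"\x00CUBIN"
assert cache[b"k"] == b"\x00CUBIN"  # same entry under either key type
```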

The values written are the compiled program bytes themselves – cubin, PTX, LTO-IR, etc. Reads return raw bytes so cache files remain consumable by external NVIDIA tools (cuobjdump, nvdisasm, cuda-gdb, …).

Most callers don’t interact with this object directly. The recommended usage is cuda.core.Program.compile()’s cache= keyword, which derives the key, returns a fresh ObjectCode on hit, and stores the compile result on miss:

with FileStreamProgramCache() as cache:
    obj = program.compile("cubin", cache=cache)

The escape hatch – only needed when the compile inputs require an extra_digest (header / PCH content fingerprints, NVVM libdevice) – is to call make_program_cache_key() yourself and use the cache as a plain bytes mapping:

from cuda.core import ObjectCode

key = make_program_cache_key(
    code=source,
    code_type="c++",
    options=options,
    target_type="cubin",
    extra_digest=header_fingerprint(),
)
data = cache.get(key)
if data is None:
    obj = program.compile("cubin")
    cache[key] = obj  # extracts bytes(obj.code)
else:
    obj = ObjectCode.from_cubin(data)

The cache layer does no payload validation; bytes go in and come back out unchanged. Symbol-mapping metadata that ObjectCode carries when produced with NVRTC name expressions is not preserved across a cache round-trip – the binary alone is stored. Callers that need symbol_mapping for get_kernel(name_expression) should compile fresh, or look the mangled symbol up by hand.

Note

Concurrent-access idiom.

Use get() (or data = cache[key] inside a try / except KeyError) for lookups. There is intentionally no __contains__: the obvious if key in cache: data = cache[key] idiom is racy across processes (another writer can os.replace over the entry, or eviction can unlink it, between the check and the read), and exposing __contains__ invites that pattern. get answers both questions in one filesystem-level operation, so a successful return always carries the bytes.
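Both race-free lookup idioms can be sketched against a hypothetical in-memory stand-in for the interface (`MiniCache` and its contents are illustrative, not part of cuda.core):

```python
# MiniCache stands in for a ProgramCacheResource backend; it deliberately
# offers no __contains__, only single-operation lookups.
_store = {b"key": b"cubin-bytes"}


class MiniCache:
    def __getitem__(self, key):
        k = key.encode("utf-8") if isinstance(key, str) else bytes(key)
        return _store[k]  # raises KeyError if absent

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default


cache = MiniCache()

# Preferred: one lookup; a non-None return always carries the bytes.
data = cache.get(b"key")
if data is None:
    data = b"recompiled"  # cache miss path

# Equally safe: EAFP with KeyError instead of a check-then-read race.
try:
    data = cache[b"key"]
except KeyError:
    data = b"recompiled"
```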

Methods

__init__()#
abstract clear() → None#

Remove every entry from the cache.

close() → None#

Release backend resources.

The default implementation does nothing. Subclasses that hold long-lived state (open file handles, database connections, network sockets, …) should override this to release them.

Callers should use the context-manager form (with cache:) or call close() explicitly when finished, so code stays portable across backends that do hold resources.
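A subclass holding long-lived state might override close() along these lines (`SqliteProgramCache` and its schema are a hypothetical sketch, not a cuda.core backend):

```python
import sqlite3


# Hypothetical backend that keeps an open database connection and
# releases it in close(); the context-manager form keeps call sites
# portable across backends that do or do not hold resources.
class SqliteProgramCache:
    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache (k BLOB PRIMARY KEY, v BLOB)"
        )

    def close(self):
        self._db.close()  # release the backend resource

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


with SqliteProgramCache() as cache:
    pass  # use the cache; the connection is closed on exit
```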

get(
key: bytes | str,
default: bytes | None = None,
) → bytes | None#

Return self[key], or default if the key is absent.

update(
items: Mapping[bytes | str, bytes | bytearray | memoryview | ObjectCode] | Iterable[tuple[bytes | str, bytes | bytearray | memoryview | ObjectCode]],
/,
) → None#

Bulk __setitem__.

Accepts a mapping or an iterable of (key, value) pairs. Each write goes through __setitem__ so backend-specific value coercion (e.g. extracting bytes from an ObjectCode) and size-cap enforcement run on every entry. Not transactional – a failure mid-iteration leaves earlier writes committed.
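The per-entry routing through __setitem__ and the non-transactional failure mode can be sketched with a hypothetical size-capped backend (`CapCache` is illustrative only):

```python
# CapCache mimics a backend with a per-value size cap; update() funnels
# every entry through __setitem__, so the cap is enforced per entry and a
# mid-iteration failure leaves earlier writes committed.
class CapCache:
    def __init__(self, max_value_size=8):
        self._store = {}
        self._cap = max_value_size

    def __setitem__(self, key, value):
        data = bytes(value)  # value coercion happens here, per entry
        if len(data) > self._cap:
            raise ValueError("value exceeds size cap")
        k = key.encode("utf-8") if isinstance(key, str) else bytes(key)
        self._store[k] = data

    def update(self, items, /):
        pairs = items.items() if hasattr(items, "items") else items
        for key, value in pairs:  # not transactional
            self[key] = value


cache = CapCache()
try:
    cache.update([(b"a", b"ok"), (b"b", b"x" * 16), (b"c", b"never")])
except ValueError:
    pass
assert b"a" in cache._store      # first write committed
assert b"c" not in cache._store  # later entries never attempted
```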