cuda.core 1.1.0 Release Notes

New features

  • Added Host as the symmetric counterpart of Device for expressing managed-memory locations: Host() (any host), Host(numa_id=N) (specific NUMA node), and Host.numa_current() (calling thread’s NUMA node).

  • Added ManagedBuffer, a Buffer subclass returned by ManagedMemoryResource.allocate() that exposes a property-style advice API:

    • buf.read_mostly (bool) — driver-backed get/set.

    • buf.preferred_location (Device | Host | None) — driver-backed get/set; assigning None unsets.

    • buf.accessed_by — a live, set-like view; add() / discard() issue advice, iteration queries the driver.

    • buf.prefetch(location, *, stream), buf.discard(*, stream), buf.discard_prefetch(location, *, stream) — instance methods that delegate to the matching free functions.

    Use ManagedBuffer.from_handle() to wrap an existing managed-memory pointer.

  • Added batched managed-memory range operations to cuda.core.utils (CUDA 13+): prefetch_batch(), discard_batch(), and discard_prefetch_batch(). Each takes a sequence of managed Buffer instances and dispatches to the corresponding cuMem*BatchAsync driver entry point. This addresses the managed-memory portion of #1333. Single-buffer operations are available on ManagedBuffer, both as instance methods (prefetch(), discard(), discard_prefetch()) and as advice properties (read_mostly, preferred_location, accessed_by). Locations are expressed as Device or Host objects.
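
As a rough illustration of the property-style advice API, the sketch below applies advice to a single buffer and then prefetches it. It uses only the attribute and method names listed in these notes (read_mostly, preferred_location, accessed_by.add(), prefetch(location, *, stream)). The helper apply_managed_advice and the FakeManagedBuffer stand-in are hypothetical names introduced here so the example can be dry-run without a GPU; they are not part of cuda.core, and the stand-in does not model driver behavior.

```python
def apply_managed_advice(buf, device, host, *, stream):
    """Apply advice to one managed buffer, then prefetch it.

    Hypothetical helper, not part of cuda.core. `buf` is assumed to behave
    like a ManagedBuffer, `device` like a Device, and `host` like a Host.
    Nothing is imported, so the sketch does not depend on a specific build.
    """
    # Mark the range as read-mostly (a bool property per the notes).
    buf.read_mostly = True
    # Set the preferred location; assigning None would unset it again.
    buf.preferred_location = device
    # Record an additional accessor through the set-like view.
    buf.accessed_by.add(host)
    # Prefetch to the device; `stream` is keyword-only per the notes.
    buf.prefetch(device, stream=stream)
    # Iterating the view reports the accessors currently recorded.
    return list(buf.accessed_by)


class FakeManagedBuffer:
    """Hypothetical stand-in used only to dry-run the helper without a GPU."""

    def __init__(self):
        self.read_mostly = False
        self.preferred_location = None
        self.accessed_by = set()   # a plain set, not a driver-backed view
        self.prefetched = []       # records (location, stream) pairs

    def prefetch(self, location, *, stream):
        self.prefetched.append((location, stream))
```

With a real installation, buf would be the ManagedBuffer returned by ManagedMemoryResource.allocate(), and device and host would be Device and Host instances such as Host() or Host(numa_id=N). The batched functions in cuda.core.utils are not shown because these notes do not spell out their full signatures.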