cuda.core.utils.prefetch_batch
cuda.core.utils.prefetch_batch(
    stream: Stream | GraphBuilder,
    buffers: Sequence[Buffer],
    locations: Device | Host | Sequence[Device | Host],
)
Prefetch a batch of managed-memory ranges to target locations.
Requires CUDA 13+. For a single buffer, use ManagedBuffer.prefetch() instead.
Parameters:
    stream (Stream | GraphBuilder) – Stream for the asynchronous prefetch. First positional, required (mirrors launch()).
    buffers (Sequence[Buffer]) – Two or more managed allocations to operate on.
    locations (Device | Host | Sequence[…]) – Target location(s). A single location applies to all buffers; a sequence must match len(buffers).
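The rule for locations (one location is broadcast to every buffer; a sequence must have one entry per buffer) can be illustrated with a small pure-Python sketch. The helper normalize_locations is hypothetical and not part of cuda.core, and plain strings stand in for Device, Host, and Buffer objects; the library's actual validation may differ.

```python
from collections.abc import Sequence


def normalize_locations(buffers, locations):
    """Hypothetical helper: expand `locations` to one entry per buffer."""
    # Strings are sequences too, but in this sketch they stand in for a
    # single location object, so they are excluded from the sequence branch.
    if isinstance(locations, Sequence) and not isinstance(locations, (str, bytes)):
        if len(locations) != len(buffers):
            raise ValueError(
                f"got {len(locations)} locations for {len(buffers)} buffers"
            )
        return list(locations)
    # A single location applies to every buffer.
    return [locations] * len(buffers)


# Placeholder values only; real code would pass Buffer/Device/Host objects.
broadcast = normalize_locations(["buf_a", "buf_b"], "device0")
paired = normalize_locations(["buf_a", "buf_b"], ["device0", "host"])
```

Here broadcast pairs both buffers with the same location, while paired keeps the per-buffer targets in order.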
Notes
On a CUDA 12 build, this function falls back to a Python-level loop that calls cuMemPrefetchAsync once per buffer (CUDA 12 has no batched driver entry point). CUDA 13 builds use cuMemPrefetchBatchAsync directly.
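The dispatch described in these notes can be sketched roughly as follows. This is an illustration only, not the cuda.core implementation: prefetch_batch_sketch, batch_call, and single_call are hypothetical names, and the two callables merely stand in for the batched and per-buffer driver entry points.

```python
def prefetch_batch_sketch(stream, buffers, locations, *, single_call, batch_call=None):
    """Illustrative dispatch; returns the number of calls issued."""
    if batch_call is not None:
        # Batched path: a single call covers every buffer
        # (stand-in for the batched driver entry point).
        batch_call(buffers, locations, stream)
        return 1
    # Fallback path: one call per buffer, issued in order
    # (stand-in for the per-buffer driver entry point).
    for buf, loc in zip(buffers, locations):
        single_call(buf, loc, stream)
    return len(buffers)


# Record calls instead of touching a GPU; all values are placeholders.
calls = []
issued = prefetch_batch_sketch(
    "stream0",
    ["buf_a", "buf_b"],
    ["device0", "host"],
    single_call=lambda buf, loc, s: calls.append((buf, loc, s)),
)
```

With no batched callable supplied, the sketch issues one call per buffer on the same stream, which is the per-call overhead the batched entry point avoids.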