warp.utils.AllocatorRmm
class warp.utils.AllocatorRmm
Allocator that routes Warp device memory through RAPIDS Memory Manager (RMM).
Each allocation delegates to rmm.DeviceBuffer, which uses whichever DeviceMemoryResource is active at the time of allocation (as set by rmm.mr.set_current_device_resource()). Changing the RMM resource between allocations affects only subsequent allocations.

Requires the rmm package (Linux only, pip install rmm-cu12).

A single AllocatorRmm instance can safely be shared across multiple CUDA devices. Allocations always happen on the correct device because warp.array wraps each allocate() call in a device.context_guard. This class is not thread-safe; concurrent calls from multiple threads require external synchronization.

Each allocation is stream-ordered on the current Warp device stream, matching the pattern used by CuPy's RMM integration. This ensures correct behavior with stream-ordered memory resources (e.g., rmm.mr.CudaAsyncMemoryResource) and during CUDA graph capture.

Example

```python
import rmm
import warp as wp

# Route all subsequent Warp device allocations through an RMM pool
rmm.reinitialize(pool_allocator=True, initial_pool_size=2**30)
wp.set_cuda_allocator(wp.utils.AllocatorRmm())

# All subsequent warp.array allocations go through the RMM pool
```
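The point that the resource is captured at allocation time can be illustrated with a small stand-in for RMM's current-resource mechanism. This is a sketch, not RMM code: the names here are placeholders that mimic the lookup-at-allocation-time behavior of rmm.mr.set_current_device_resource(), and it runs without a GPU.

```python
# Stand-in for RMM's "current device resource" mechanism (illustrative only;
# the real calls are rmm.mr.set_current_device_resource() and
# rmm.DeviceBuffer, which require a CUDA device).

_current_resource = "cuda"  # placeholder default resource


def set_current_device_resource(name):
    """Mimics rmm.mr.set_current_device_resource(): swaps the active resource."""
    global _current_resource
    _current_resource = name


class Buffer:
    """Mimics a DeviceBuffer: the resource is read when the buffer is created."""

    def __init__(self, size):
        self.resource = _current_resource  # captured at allocation time
        self.size = size


a = Buffer(256)                       # allocated from the default resource
set_current_device_resource("pool")   # change the active resource
b = Buffer(256)                       # allocated from the new resource

assert a.resource == "cuda"  # the earlier allocation is unaffected
assert b.resource == "pool"  # later allocations pick up the new resource
```

This mirrors the note above: swapping the RMM resource never retroactively moves existing buffers; it only changes where new allocations come from.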
Methods
__init__()

allocate(size_in_bytes)
Allocate device memory via RMM and return a device pointer.

deallocate(ptr, size_in_bytes)
Free device memory by releasing the RMM DeviceBuffer.
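Since allocate() returns a raw pointer while the memory is owned by a DeviceBuffer, the allocator must keep each buffer alive until deallocate() releases it. A minimal sketch of that bookkeeping, using a fake buffer class in place of rmm.DeviceBuffer so it runs without a GPU (both class names below are illustrative, not Warp's implementation):

```python
class _FakeDeviceBuffer:
    """Stand-in for rmm.DeviceBuffer: owns memory identified by a fake pointer."""

    _next_ptr = 1

    def __init__(self, size):
        self.size = size
        self.ptr = _FakeDeviceBuffer._next_ptr
        _FakeDeviceBuffer._next_ptr += size


class AllocatorSketch:
    """Illustrative allocate()/deallocate() bookkeeping, keyed by pointer."""

    def __init__(self):
        # ptr -> buffer; holding the reference keeps the memory alive
        self._live = {}

    def allocate(self, size_in_bytes):
        buf = _FakeDeviceBuffer(size_in_bytes)
        self._live[buf.ptr] = buf  # retain the buffer until deallocate()
        return buf.ptr             # caller only sees the raw pointer

    def deallocate(self, ptr, size_in_bytes):
        # Dropping the last reference releases the underlying buffer
        del self._live[ptr]


alloc = AllocatorSketch()
p = alloc.allocate(1024)
assert p in alloc._live      # buffer is retained while in use
alloc.deallocate(p, 1024)
assert p not in alloc._live  # buffer released on deallocate
```

The same pattern (a dictionary from device pointer to owning buffer) is a common way to bridge pointer-based allocator interfaces with object-owned memory such as rmm.DeviceBuffer.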