warp.utils.AllocatorRmm#

class warp.utils.AllocatorRmm[source]#

Allocator that routes Warp device memory through RAPIDS Memory Manager (RMM).

Each allocation delegates to rmm.DeviceBuffer, which uses whichever DeviceMemoryResource is active at the time of allocation (as set by rmm.mr.set_current_device_resource()). Changing the RMM resource between allocations affects only subsequent allocations; buffers already allocated are released back through the resource that created them.

Requires the rmm package, which is available on Linux only (e.g. pip install rmm-cu12 for CUDA 12).

A single AllocatorRmm instance can safely be shared across multiple CUDA devices. Allocations always happen on the correct device because warp.array wraps each allocate() call in a device.context_guard. This class is not thread-safe; concurrent calls from multiple threads require external synchronization.
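If concurrent access is needed, one option is a lock-guarded wrapper around the allocator. The sketch below is illustrative only: LockedAllocator is a hypothetical name, not part of Warp, and it assumes only the allocate()/deallocate() contract described on this page.

```python
import threading


class LockedAllocator:
    """Hypothetical wrapper that serializes calls to an allocator
    that is not itself thread-safe (e.g. AllocatorRmm)."""

    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def allocate(self, size_in_bytes):
        # Hold the lock for the duration of the inner allocation.
        with self._lock:
            return self._inner.allocate(size_in_bytes)

    def deallocate(self, ptr, size_in_bytes):
        with self._lock:
            self._inner.deallocate(ptr, size_in_bytes)
```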

Each allocation is stream-ordered on the current Warp device stream, matching the pattern used by CuPy’s RMM integration. This ensures correct behavior with stream-ordered memory resources (e.g., rmm.mr.CudaAsyncMemoryResource) and during CUDA graph capture.

Example

import rmm
import warp as wp

rmm.reinitialize(pool_allocator=True, initial_pool_size=2**30)
wp.set_cuda_allocator(wp.utils.AllocatorRmm())
# All subsequent wp.array allocations go through the RMM pool
__init__()[source]#

Methods

__init__()

allocate(size_in_bytes)

Allocate device memory via RMM and return a device pointer.

deallocate(ptr, size_in_bytes)

Free device memory by releasing the RMM DeviceBuffer.

allocate(size_in_bytes)[source]#

Allocate device memory via RMM and return a device pointer.

Parameters:

size_in_bytes (int)

Return type:

int

deallocate(ptr, size_in_bytes)[source]#

Free device memory by releasing the RMM DeviceBuffer.

Parameters:
  • ptr (int)

  • size_in_bytes (int)

Return type:

None
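The allocate()/deallocate() contract can be sketched in pure Python. In the stand-in below, ctypes host buffers play the role of rmm.DeviceBuffer; HostAllocatorSketch is a hypothetical name for illustration. The key pattern is the same one AllocatorRmm needs: the buffer object returned by the resource must be kept alive in a ptr-to-buffer map so that deallocate() can release it later by pointer alone.

```python
import ctypes


class HostAllocatorSketch:
    """Illustrative stand-in for AllocatorRmm using host memory."""

    def __init__(self):
        # Maps device pointer -> buffer object keeping the allocation alive.
        self._buffers = {}

    def allocate(self, size_in_bytes: int) -> int:
        # In AllocatorRmm this would be rmm.DeviceBuffer(size=size_in_bytes).
        buf = ctypes.create_string_buffer(size_in_bytes)
        ptr = ctypes.addressof(buf)
        self._buffers[ptr] = buf
        return ptr

    def deallocate(self, ptr: int, size_in_bytes: int) -> None:
        # Dropping the last reference frees the memory; in AllocatorRmm,
        # this releases the DeviceBuffer back to the active RMM resource.
        del self._buffers[ptr]
```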