cub::CachingDeviceAllocator
struct CachingDeviceAllocator
- A simple caching allocator for device memory allocations.
- Overview
- The allocator is thread-safe and stream-safe and is capable of managing cached device allocations on multiple devices. It behaves as follows: 
- Allocations from the allocator are associated with an active_stream. Once freed, an allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
- Allocations are categorized and cached by bin size. A new allocation request of a given size will only consider cached allocations within the corresponding bin. 
- Bin limits progress geometrically in accordance with the growth factor bin_growth provided during construction. Unused device allocations within a larger bin cache are not reused for allocation requests that categorize to smaller bin sizes.
- Allocation requests below (bin_growth ^ min_bin) are rounded up to (bin_growth ^ min_bin).
- Allocations above (bin_growth ^ max_bin) are not rounded up to the nearest bin and are simply freed when they are deallocated instead of being returned to a bin-cache.
- If returning an allocation to its bin-cache would push the total storage of cached allocations on a given device above max_cached_bytes, the allocation is simply freed when it is deallocated instead of being returned to its bin-cache.
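The binning rules above can be sketched on the host as follows. This is a minimal illustration under stated assumptions, not the library's implementation: BinChoice and ChooseBin are hypothetical names introduced here, and the sketch only models how a request size maps to a bin.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical result type (not part of CUB): how a request was classified.
struct BinChoice {
    bool cacheable;         // false when the request exceeds the largest bin
    unsigned int bin;       // exponent of the chosen bin (meaningful only if cacheable)
    std::size_t bin_bytes;  // bytes reserved for the request
};

// Finds the smallest bin whose capacity bin_growth^bin holds `bytes`,
// never going below min_bin. Requests larger than bin_growth^max_bin
// are reported as not cacheable and keep their exact size.
inline BinChoice ChooseBin(std::size_t bytes, unsigned int bin_growth,
                           unsigned int min_bin, unsigned int max_bin) {
    assert(bin_growth > 1 && min_bin <= max_bin);
    unsigned int bin = 0;
    std::size_t capacity = 1;  // bin_growth^0
    while (capacity < bytes || bin < min_bin) {
        // Stop at the largest bin, or before the capacity would overflow.
        if (bin == max_bin ||
            capacity > std::numeric_limits<std::size_t>::max() / bin_growth) {
            return {false, 0, bytes};
        }
        capacity *= bin_growth;
        ++bin;
    }
    return {true, bin, capacity};
}
```

With the default configuration (bin_growth = 8, min_bin = 3, max_bin = 7), ChooseBin(513, 8, 3, 7) lands in bin 4 with 4096 bytes reserved, while a request of 2,097,153 bytes is one byte past the largest bin and is therefore not cacheable in this model.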
 
- For example, the default-constructed CachingDeviceAllocator is configured with:
- bin_growth = 8
- min_bin = 3
- max_bin = 7
- max_cached_bytes = 6MB - 1B
 
- which delineates five bin sizes: 512B, 4KB, 32KB, 256KB, and 2MB, and sets a maximum of 6,291,455 cached bytes per device.
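The figures for the default configuration can be checked with a few lines of host code. The helpers IntPow, BinSizes, and DefaultMaxCachedBytes are hypothetical names used only for this sketch. The cap formula is the one the default constructor documents, ((bin_growth ^ max_bin) * 3) - 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Integer power base^exp, assuming the result fits in size_t.
inline std::size_t IntPow(std::size_t base, unsigned int exp) {
    std::size_t result = 1;
    for (unsigned int i = 0; i < exp; ++i) {
        result *= base;
    }
    return result;
}

// Capacities of every bin from min_bin to max_bin inclusive.
inline std::vector<std::size_t> BinSizes(unsigned int bin_growth,
                                         unsigned int min_bin,
                                         unsigned int max_bin) {
    assert(min_bin <= max_bin);
    std::vector<std::size_t> sizes;
    for (unsigned int bin = min_bin; bin <= max_bin; ++bin) {
        sizes.push_back(IntPow(bin_growth, bin));
    }
    return sizes;
}

// Documented default cap: three times the largest bin, minus one byte.
inline std::size_t DefaultMaxCachedBytes(unsigned int bin_growth,
                                         unsigned int max_bin) {
    return IntPow(bin_growth, max_bin) * 3 - 1;
}
```

BinSizes(8, 3, 7) yields the five capacities 512, 4096, 32768, 262144, and 2097152 bytes, and DefaultMaxCachedBytes(8, 7) evaluates to 6291455.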
- Public Functions
- inline CachingDeviceAllocator(
- unsigned int bin_growth,
- unsigned int min_bin = 1,
- unsigned int max_bin = INVALID_BIN,
- size_t max_cached_bytes = INVALID_SIZE,
- bool skip_cleanup = false,
- Constructor.
- Parameters:
- bin_growth – Geometric growth factor for bin-sizes 
- min_bin – Minimum bin (default is bin_growth ^ 1) 
- max_bin – Maximum bin (default is no max bin) 
- max_cached_bytes – Maximum aggregate cached bytes per device (default is no limit) 
- skip_cleanup – Whether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate)
- debug – Whether or not to print (de)allocation events to stdout (default is no output)
 
 
 
 - inline CachingDeviceAllocator(
- bool skip_cleanup = false,
- bool debug = false,
- Default constructor.
- Configured with:
- bin_growth = 8
- min_bin = 3
- max_bin = 7
- max_cached_bytes = ((bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes
- which delineates five bin sizes: 512B, 4KB, 32KB, 256KB, and 2MB, and sets a maximum of 6,291,455 cached bytes per device.
 
 
 
 - inline cudaError_t SetMaxCachedBytes(size_t max_cached_bytes_)
- Sets the limit on the number of bytes this allocator is allowed to cache per device.
- Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().
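How the cap interacts with deallocation can be modelled with a small host-side sketch. CacheBudget, OnFree, and SetMax are hypothetical names, not CUB's data structures or API. The model only tracks per-device cached bytes and assumes a freed block is kept when the cached total stays at or below the cap.

```cpp
#include <cstddef>

// Hypothetical per-device bookkeeping (illustrative only).
struct CacheBudget {
    std::size_t max_cached_bytes;  // current ceiling
    std::size_t cached_bytes;      // bytes currently held in the cache

    explicit CacheBudget(std::size_t max_bytes)
        : max_cached_bytes(max_bytes), cached_bytes(0) {}

    // Returns true if a freed block is kept in the cache, false if it
    // would be released to the device instead.
    bool OnFree(std::size_t block_bytes) {
        if (cached_bytes + block_bytes > max_cached_bytes) {
            return false;
        }
        cached_bytes += block_bytes;
        return true;
    }

    // Changing the ceiling leaves already-cached bytes untouched;
    // it only affects later OnFree decisions.
    void SetMax(std::size_t new_max) { max_cached_bytes = new_max; }
};
```

In this model, a cap of 6,291,455 bytes keeps two freed 2MB blocks but not a third, because 3 x 2,097,152 = 6,291,456 exceeds the cap by one byte. Lowering the ceiling afterwards does not shrink cached_bytes, which mirrors the documented behavior that only FreeAllCached() releases cached allocations.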
 - inline cudaError_t DeviceAllocate(
- int device,
- void **d_ptr,
- size_t bytes,
- cudaStream_t active_stream = 0,
- Provides a suitable allocation of device memory for the given size on the specified device.
- Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
- Parameters:
- device – [in] Device on which to place the allocation 
- d_ptr – [out] Reference to pointer to the allocation 
- bytes – [in] Minimum number of bytes for the allocation 
- active_stream – [in] The stream to be associated with this allocation 
 
 
 
 - inline cudaError_t DeviceAllocate(
- void **d_ptr,
- size_t bytes,
- cudaStream_t active_stream = 0,
- Provides a suitable allocation of device memory for the given size on the current device.
- Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
- Parameters:
- d_ptr – [out] Reference to pointer to the allocation 
- bytes – [in] Minimum number of bytes for the allocation 
- active_stream – [in] The stream to be associated with this allocation 
 
 
 
 - inline cudaError_t DeviceFree(int device, void *d_ptr)
- Frees a live allocation of device memory on the specified device, returning it to the allocator.
- Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
 - inline cudaError_t DeviceFree(void *d_ptr)
- Frees a live allocation of device memory on the current device, returning it to the allocator.
- Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
 - inline cudaError_t FreeAllCached()
- Frees all cached device allocations on all devices. 
 - inline virtual ~CachingDeviceAllocator()
- Destructor.