cuda::experimental::stf::data_place_extension#
-
class data_place_extension#
Base class for data_place extensions.
Custom data place types inherit from this class and override virtual methods to provide place-specific behavior. This enables extensibility without modifying the core data_place class.
Example usage for a custom place type:
class my_custom_extension : public data_place_extension {
public:
  exec_place get_affine_exec_place() const override { ... }
  int get_device_ordinal() const override { return my_device_id; }
  ::std::string to_string() const override { return "my_custom_place"; }
  size_t hash() const override { return std::hash<int>{}(my_device_id); }
  bool equals(const data_place_extension& other) const override { ... }
};
Subclassed by cuda::experimental::stf::green_ctx_data_place::extension
Public Functions
-
virtual ~data_place_extension() = default#
-
virtual exec_place get_affine_exec_place() const = 0#
Get the affine execution place for this data place.
Returns the exec_place that should be used for computation on data stored at this place. The exec_place may have its own virtual methods (e.g., activate/deactivate) for execution-specific behavior.
-
virtual int get_device_ordinal() const = 0#
Get the device ordinal for this place.
Returns the CUDA device ID associated with this place. For host-only places, this should return -1.
-
virtual ::std::string to_string() const = 0#
Get a string representation of this place.
Used for debugging and logging purposes.
-
virtual size_t hash() const = 0#
Compute a hash value for this place.
Used for storing data_place in hash-based containers.
-
virtual bool equals(const data_place_extension &other) const = 0#
Check equality with another extension.
- Parameters:
other – The other extension to compare with
- Returns:
true if the extensions represent the same place
-
inline virtual CUresult mem_create(CUmemGenericAllocationHandle *handle, size_t size)#
Create a physical memory allocation for this place (VMM API)
This method is used by localized arrays (composite_slice) to create physical memory segments that are then mapped into a contiguous virtual address space. Custom place types can override this method to provide specialized memory allocation behavior.
See also
allocate() for regular memory allocation
Note
Managed memory is not supported by the VMM API.
- Parameters:
handle – Output parameter for the allocation handle
size – Size of the allocation in bytes
- Returns:
CUresult indicating success or failure
-
virtual void *allocate(::std::ptrdiff_t size, cudaStream_t stream)#
Allocate memory for this place (raw allocation)
This is the low-level allocation interface. For stream-ordered allocations (where allocation_is_stream_ordered() returns true), the allocation will be ordered with respect to other operations on the stream. For immediate allocations, the stream parameter is ignored.
- Parameters:
size – Size of the allocation in bytes
stream – CUDA stream for stream-ordered allocations (ignored for immediate allocations)
- Returns:
Pointer to allocated memory
-
virtual void deallocate(void *ptr, size_t size, cudaStream_t stream)#
Deallocate memory for this place (raw deallocation)
- Parameters:
ptr – Pointer to memory to deallocate
size – Size of the allocation
stream – CUDA stream for stream-ordered deallocations (ignored for immediate deallocations)
-
inline virtual bool allocation_is_stream_ordered() const#
Returns true if allocation/deallocation is stream-ordered.
When this returns true, the allocation uses stream-ordered APIs like cudaMallocAsync, and allocators should use stream_async_op to synchronize prerequisites before allocation.
When this returns false, the allocation is immediate (like cudaMallocHost) and the stream parameter is ignored. Note that immediate deallocations (e.g., cudaFree) may or may not introduce implicit synchronization.
Default is true since most GPU-based extensions use cudaMallocAsync.