cuda::experimental::stf::place_partition

class place_partition

Get subsets of an execution place.

Computes a vector of execution places that partition the input place at a given granularity (see place_partition_scope). For example, a grid place can be partitioned into devices, green contexts, or CUDA streams.

Use the constructors that take async_resources_handle& when partitioning at cuda_stream or green_context scope (stream and green-context resources are obtained from the handle). The constructors without a handle support only cuda_device scope. Green context scope requires CUDA 12.4 or later.

Iteration over subplaces is provided via begin() / end(); as_grid() builds an exec_place_grid from the subplaces.
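As a rough illustration of the iteration interface described above, the sketch below walks a partition with a range-based for loop. It deliberately does not include the CUDASTF headers: `sketch::exec_place` and `sketch::place_partition` are hypothetical stand-ins that only mirror the begin() / end() / size() signatures documented on this page, and the `id` field and the `collect_ids` helper are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the CUDASTF types. They mirror only the
// iteration signatures documented on this page so that the example
// builds without a CUDA toolchain.
namespace sketch {
struct exec_place {
  int id; // invented field, used only to tell subplaces apart
};

class place_partition {
public:
  using iterator       = std::vector<exec_place>::iterator;
  using const_iterator = std::vector<exec_place>::const_iterator;

  explicit place_partition(std::vector<exec_place> subplaces)
      : sub_places_(std::move(subplaces)) {}

  iterator begin() { return sub_places_.begin(); }
  iterator end() { return sub_places_.end(); }
  const_iterator begin() const { return sub_places_.begin(); }
  const_iterator end() const { return sub_places_.end(); }
  std::size_t size() const { return sub_places_.size(); }

private:
  std::vector<exec_place> sub_places_;
};
} // namespace sketch

// Visit every subplace in order and record its id.
// begin() / end() are what make the range-based for loop work.
std::vector<int> collect_ids(const sketch::place_partition& partition) {
  std::vector<int> ids;
  ids.reserve(partition.size());
  for (const auto& sub : partition) {
    ids.push_back(sub.id);
  }
  return ids;
}
```

With the real class, the loop body would typically use each subplace as the execution place of a task; the stand-in only records an identifier so that the traversal order is visible.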

Public Types

using iterator = ::std::vector<exec_place>::iterator

Iteration over subplaces.

using const_iterator = ::std::vector<exec_place>::const_iterator

Const iteration over subplaces.

Public Functions

inline place_partition(
exec_place place,
async_resources_handle &handle,
place_partition_scope scope
)

Partition an execution place into a vector of subplaces (with async handle).

Parameters:
  • place – The execution place to partition (e.g. grid or device)

  • handle – Handle used to obtain stream or green-context resources when scope is cuda_stream or green_context

  • scope – Partitioning granularity (cuda_device, green_context, or cuda_stream)

inline place_partition(
exec_place place,
place_partition_scope scope
)

Partition an execution place into a vector of subplaces (no async handle).

Only cuda_device scope is supported; green_context and cuda_stream require a handle.

Parameters:
  • place – The execution place to partition

  • scope – Partitioning granularity (must be cuda_device when no handle is provided)

inline place_partition(
async_resources_handle &handle,
const ::std::vector<::std::shared_ptr<exec_place>> &places,
place_partition_scope scope
)

Partition a vector of execution places into a single vector of subplaces (with async handle).

Parameters:
  • handle – Handle for stream or green-context resources when scope is cuda_stream or green_context

  • places – Input execution places to partition

  • scope – Partitioning granularity (cuda_device, green_context, or cuda_stream)

inline place_partition(
async_resources_handle &handle,
const exec_place_grid &grid,
place_partition_scope scope
)

Partition a grid of execution places into a single vector of subplaces (with async handle).

Parameters:
  • handle – Handle for stream or green-context resources when scope is cuda_stream or green_context

  • grid – Input execution place grid to partition

  • scope – Partitioning granularity (cuda_device, green_context, or cuda_stream)

inline place_partition(
const ::std::vector<::std::shared_ptr<exec_place>> &places,
place_partition_scope scope
)

Partition a vector of execution places into a single vector of subplaces (no async handle).

Only cuda_device scope is supported; green_context and cuda_stream require a handle.

Parameters:
  • places – Input execution places to partition

  • scope – Partitioning granularity (must be cuda_device)

~place_partition() = default

inline iterator begin()

Iterator to the first subplace.

Returns:

Begin iterator.

inline iterator end()

Past-the-end iterator for subplaces.

Returns:

End iterator.

inline const_iterator begin() const

Const iterator to the first subplace.

Returns:

Begin const iterator.

inline const_iterator end() const

Past-the-end const iterator.

Returns:

End const iterator.

inline size_t size() const

Number of subplaces in the partition.

Returns:

Size of the partition.

inline exec_place &get(size_t i)

Get the i-th subplace (mutable).

Parameters:

i – Index in [0, size()).

Returns:

Reference to the i-th exec_place.

inline const exec_place &get(size_t i) const

Get the i-th subplace (const).

Parameters:

i – Index in [0, size()).

Returns:

Const reference to the i-th exec_place.
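One plausible use of indexed access is to spread work items over the subplaces in round-robin order. The sketch below assumes exactly that and nothing more: `sketch::exec_place` and `sketch::place_partition` are hypothetical stand-ins (not the CUDASTF types) that mirror only the size() and get() signatures documented here, and the `id` field and the `assign_round_robin` helper are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the CUDASTF types. Only size() and get()
// are modeled, following the signatures documented on this page.
namespace sketch {
struct exec_place {
  int id; // invented field, used only to tell subplaces apart
};

class place_partition {
public:
  explicit place_partition(std::vector<exec_place> subplaces)
      : sub_places_(std::move(subplaces)) {}

  std::size_t size() const { return sub_places_.size(); }
  exec_place& get(std::size_t i) { return sub_places_[i]; }
  const exec_place& get(std::size_t i) const { return sub_places_[i]; }

private:
  std::vector<exec_place> sub_places_;
};
} // namespace sketch

// For each of `num_items` work items, return the id of the subplace it
// is assigned to. Item k goes to subplace k % size(), so every index
// passed to get() stays inside [0, size()).
std::vector<int> assign_round_robin(const sketch::place_partition& partition,
                                    std::size_t num_items) {
  if (partition.size() == 0) {
    throw std::invalid_argument("partition has no subplaces");
  }
  std::vector<int> assignment;
  assignment.reserve(num_items);
  for (std::size_t k = 0; k < num_items; ++k) {
    assignment.push_back(partition.get(k % partition.size()).id);
  }
  return assignment;
}
```

The modulo keeps the index within the range required by get(); the explicit empty check is a choice made for this sketch, since the page does not say how the real class behaves for an out-of-range index.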

inline exec_place_grid as_grid() const

Build an exec_place_grid from the subplaces.

Returns:

A grid view of the partitioned execution places.