cuda::experimental::places::exec_place_resources

class exec_place_resources

A registry of per-place stream pools keyed by exec_place::impl*.

For every distinct pooled impl pointer the registry is queried with, it owns one {compute, data} pair of stream_pools. The pair is created lazily on first lookup, with sizes exec_place_default_pool_size and exec_place_default_data_pool_size respectively.

The map itself is mutex-guarded. The mutex is held only for the find/insert into the map; stream creation, which happens lazily inside stream_pool::next, runs outside the lock, so contention is limited to slow-path task submission.
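The locking scheme described above can be sketched without CUDA. This is a minimal illustration, not the library's implementation: toy_stream_pool, toy_per_place_pools and toy_registry are hypothetical stand-ins, the pool sizes 4 and 2 are placeholders, and the toy pool only counts slots instead of creating cudaStream_t handles. It assumes C++17 for try_emplace.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for stream_pool: it records a capacity and creates
// nothing until next() is called. Not synchronized; illustration only.
struct toy_stream_pool {
  explicit toy_stream_pool(std::size_t n) : capacity(n) {}
  std::size_t capacity;
  std::size_t created = 0;

  // Lazily "creates" one more slot, up to capacity. In this sketch the
  // registry lock is never held while this runs.
  std::size_t next() {
    if (created < capacity) {
      ++created;
    }
    return created;
  }
};

// Hypothetical {compute, data} pair; the sizes are placeholders.
struct toy_per_place_pools {
  toy_stream_pool compute{4};
  toy_stream_pool data{2};
};

// Hypothetical registry: the mutex covers only the find/insert.
class toy_registry {
public:
  toy_registry() = default;
  toy_registry(const toy_registry&) = delete;
  toy_registry& operator=(const toy_registry&) = delete;

  toy_per_place_pools& get(const void* impl_key) {
    std::lock_guard<std::mutex> lock(mtx_);
    // try_emplace constructs the entry only when the key is absent.
    auto result = map_.try_emplace(impl_key);
    assert(result.first != map_.end());
    return result.first->second;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mtx_);
    return map_.size();
  }

private:
  mutable std::mutex mtx_;
  std::unordered_map<const void*, toy_per_place_pools> map_;
};
```

A caller would take the reference returned by get() and call next() on one of the pools afterwards, at which point the registry lock has already been released.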

Lifetime: each entry’s pools are owned by the registry. Destroying the registry destroys every pool it has created, along with their cached cudaStream_t handles. Consequently, a registry must not outlive the CUDA primary context(s) of the devices it has cached streams for. In practice, a registry is typically embedded in an async_resources_handle and shares the lifetime of the owning STF context.

Caveats for externally-owned places:

  • User-stream places (exec_place::cuda_stream(s)) carry their own single-stream pool and never participate in the registry.

  • Green-context places carry their own pool (constructed from the green_ctx_view) and also bypass the registry. The user must keep the underlying CUgreenCtx alive as long as the place is used.

Public Functions

exec_place_resources() = default
exec_place_resources(const exec_place_resources&) = delete
exec_place_resources &operator=(const exec_place_resources&) = delete
exec_place_resources(exec_place_resources&&) = delete
exec_place_resources &operator=(exec_place_resources&&) = delete
inline per_place_pools &get(const void *impl_key)

Look up (or lazily create) the {compute, data} pool slot for the supplied impl pointer.

Thread-safe: the mutex is held only across the find/insert. The returned reference is stable for the lifetime of the registry, because std::unordered_map keeps references to its elements valid across rehashes.
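The stability guarantee relies on a standard property of std::unordered_map: a rehash invalidates iterators but not references or pointers to elements. The following sketch checks that property on a plain map; reference_survives_rehash and the int payload are hypothetical names chosen for illustration and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Returns true if a reference taken before many insertions still refers to
// the same element afterwards.
inline bool reference_survives_rehash() {
  std::unordered_map<const void*, int> m;
  int keys[1024] = {};  // only the addresses are used, as map keys

  int& first = m[&keys[0]];
  first = 42;
  const int* addr_before = &first;
  const std::size_t buckets_before = m.bucket_count();

  // Insert enough entries that the table has to grow.
  for (std::size_t i = 1; i < 1024; ++i) {
    m[&keys[i]] = static_cast<int>(i);
  }

  // With the default max load factor of 1.0, holding 1024 elements requires
  // more buckets than the initial table provides, so a rehash has occurred.
  assert(m.bucket_count() != buckets_before);

  return addr_before == &m.at(&keys[0]) && first == 42;
}
```

This is why a caller can keep the per_place_pools reference returned by get() while other threads insert new entries, as long as the registry itself stays alive.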

inline ::std::size_t size() const

Returns the number of per-place entries currently cached. Intended mainly for tests.

struct per_place_pools

Public Functions

inline per_place_pools()

Public Members

stream_pool compute
stream_pool data