cuda::latch
Defined in header <cuda/latch>:
template <cuda::thread_scope Scope>
class cuda::latch;
The class template cuda::latch is an extended form of cuda::std::latch that takes an additional cuda::thread_scope argument.
It has the same interface and semantics as cuda::std::latch.
Concurrency Restrictions
An object of type cuda::latch or cuda::std::latch shall not be accessed concurrently by CPU and GPU threads unless:
it is in unified memory and the concurrentManagedAccess property is 1, or
it is in CPU memory and the hostNativeAtomicSupported property is 1.
Note that for objects of scopes other than cuda::thread_scope_system, such concurrent access is a data race and is therefore prohibited regardless of memory characteristics.
Under CUDA Compute Capability 6 (Pascal) or prior, an object of type cuda::latch or cuda::std::latch may not be used.
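The restrictions above can be checked at run time with the CUDA runtime's device attribute queries. The following is a minimal sketch, not part of libcu++: the `LatchMemory` enum and the `concurrent_cpu_gpu_access_allowed` helper are hypothetical names introduced for illustration, and the helper only covers the memory and attribute conditions (the latch must additionally have scope cuda::thread_scope_system).

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper type (not part of libcu++): where the latch object lives.
enum class LatchMemory { managed, host };

// Encodes the two listed conditions for concurrent CPU/GPU access.
// Assumes the latch itself has scope cuda::thread_scope_system.
bool concurrent_cpu_gpu_access_allowed(LatchMemory where,
                                       int concurrent_managed_access,
                                       int host_native_atomic_supported) {
  if (where == LatchMemory::managed) {
    return concurrent_managed_access == 1;
  }
  return host_native_atomic_supported == 1;
}

int main() {
  int device = 0;
  if (cudaGetDevice(&device) != cudaSuccess) {
    std::printf("no usable CUDA device\n");
    return 1;
  }

  int major = 0, managed = 0, host_atomics = 0;
  cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, device);
  cudaDeviceGetAttribute(&managed, cudaDevAttrConcurrentManagedAccess, device);
  cudaDeviceGetAttribute(&host_atomics, cudaDevAttrHostNativeAtomicSupported, device);

  // Compute Capability 6 (Pascal) or prior: latches may not be used at all.
  std::printf("latch usable on device: %s\n", major >= 7 ? "yes" : "no");
  std::printf("CPU+GPU access, managed memory: %s\n",
              concurrent_cpu_gpu_access_allowed(LatchMemory::managed, managed, host_atomics)
                  ? "allowed" : "prohibited");
  std::printf("CPU+GPU access, CPU memory: %s\n",
              concurrent_cpu_gpu_access_allowed(LatchMemory::host, managed, host_atomics)
                  ? "allowed" : "prohibited");
  return 0;
}
```

Keeping the rule in a small pure function separates the policy from the device query, so it can be reused for devices other than the current one.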
Implementation-Defined Behavior
For each cuda::thread_scope S, the value of cuda::latch<S>::max() is as follows:
| cuda::thread_scope S | cuda::latch<S>::max() |
|----------------------|-----------------------|
| Any thread scope     |                       |
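Because max() is a static member function, the value for a given toolkit can be inspected directly. The sketch below is host-only and assumes nothing beyond the cuda::latch interface described above; it also checks that two different scopes report the same value.

```cuda
#include <cuda/latch>
#include <cstdio>

int main() {
  // max() is queried per scope; the comparison checks that the scope
  // does not change the reported value.
  const long long block_max =
      static_cast<long long>(cuda::latch<cuda::thread_scope_block>::max());
  const long long system_max =
      static_cast<long long>(cuda::latch<cuda::thread_scope_system>::max());

  std::printf("block scope max:  %lld\n", block_max);
  std::printf("system scope max: %lld\n", system_max);
  std::printf("same for both scopes: %s\n", block_max == system_max ? "yes" : "no");
  return 0;
}
```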
Example
#include <cuda/latch>

__global__ void example_kernel() {
  // This latch is suitable for all threads in the system.
  cuda::latch<cuda::thread_scope_system> a(10);

  // This latch has the same type as the previous one (`a`).
  cuda::std::latch b(10);

  // This latch is suitable for all threads on the current processor (e.g. GPU).
  cuda::latch<cuda::thread_scope_device> c(10);

  // This latch is suitable for all threads in the same thread block.
  cuda::latch<cuda::thread_scope_block> d(10);
}
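The example above only constructs latches. As a complement, here is a minimal sketch of one way a block-scope latch might be used through the cuda::std::latch interface (count_down and wait). The kernel name `block_sum`, the host helper `run_block_sum`, and the choice to build the latch in raw shared storage with placement new are assumptions made for illustration, not a pattern prescribed by the library. It should be compiled for Compute Capability 7.0 or newer, for example with `-arch=sm_70`.

```cuda
#include <cuda/latch>
#include <cuda_runtime.h>
#include <cstdio>
#include <new>
#include <vector>

using block_latch = cuda::latch<cuda::thread_scope_block>;

// Hypothetical kernel: each thread adds one input element to a shared total,
// and thread 0 publishes the total once every thread has arrived at the latch.
__global__ void block_sum(const int* in, int* out) {
  // Raw shared storage; the latch is constructed explicitly by one thread
  // because it takes its expected count as a constructor argument.
  __shared__ alignas(block_latch) unsigned char storage[sizeof(block_latch)];
  __shared__ int total;
  block_latch* done = reinterpret_cast<block_latch*>(storage);

  if (threadIdx.x == 0) {
    new (done) block_latch(blockDim.x);  // expect one arrival per thread
    total = 0;
  }
  __syncthreads();  // the latch and total are initialized before anyone uses them

  atomicAdd(&total, in[blockIdx.x * blockDim.x + threadIdx.x]);
  done->count_down();  // this thread's contribution is in

  if (threadIdx.x == 0) {
    done->wait();  // returns once the counter has reached zero
    out[blockIdx.x] = total;
  }
}

// Hypothetical host helper: sums `threads` ones in a single block.
int run_block_sum(int threads) {
  std::vector<int> h_in(threads, 1);
  int* d_in = nullptr;
  int* d_out = nullptr;
  cudaMalloc(&d_in, threads * sizeof(int));
  cudaMalloc(&d_out, sizeof(int));
  cudaMemcpy(d_in, h_in.data(), threads * sizeof(int), cudaMemcpyHostToDevice);

  block_sum<<<1, threads>>>(d_in, d_out);

  int h_out = -1;
  cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return h_out;
}

int main() {
  std::printf("sum = %d\n", run_block_sum(64));
  return 0;
}
```

A latch is single-use, so this pattern fits a one-time "all contributions are in" signal; where only thread 0 needs to wait, count_down lets the other threads continue without blocking.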