cuda::atomic
Defined in header <cuda/atomic>
:
template <typename T, cuda::thread_scope Scope = cuda::thread_scope_system>
class cuda::atomic;
The class template cuda::atomic
is an extended form of cuda::std::atomic
that takes an additional cuda::thread_scope
argument, defaulted to cuda::std::thread_scope_system
. It has the same interface and semantics as cuda::std::atomic
, with the following additional operations.
Atomic Fence Operations
cuda::atomic_thread_fence | Memory order and scope dependent fence synchronization primitive. (function) |
Atomic Extrema Operations
cuda::atomic::fetch_min | Atomically find the minimum of the stored value and a provided value. (member function) |
cuda::atomic::fetch_max | Atomically find the maximum of the stored value and a provided value. (member function) |
Concurrency Restrictions
An object of type cuda::atomic
or cuda::std::atomic
shall not be accessed concurrently by CPU and GPU threads unless:
- it is in unified memory and the
concurrentManagedAccess
property is 1, or - it is in CPU memory and the
hostNativeAtomicSupported
property is 1.
Note, for objects of scopes other than cuda::thread_scope_system
this is a data-race, and thefore also prohibited regardless of memory characteristics.
Under CUDA Compute Capability 6 (Pascal), an object of type atomic
may not be used:
- with automatic storage duration, or
- if
is_always_lock_free()
isfalse
.
Under CUDA Compute Capability prior to 6 (Pascal), objects of type cuda::atomic
or cuda::std::atomic
may not be used.
Implementation-Defined Behavior
For each type T
and cuda::thread_scope
S
, the value of cuda::atomic<T, S>::is_always_lock_free()
is as follows:
Type T | cuda::thread_scope S | cuda::atomic<T, S>::is_always_lock_free() |
---|---|---|
Any | Any | sizeof(T) <= 8 |
Example
#include <cuda/atomic>
__global__ void example_kernel() {
// This atomic is suitable for all threads in the system.
cuda::atomic<int, cuda::thread_scope_system> a;
// This atomic has the same type as the previous one (`a`).
cuda::atomic<int> b;
// This atomic is suitable for all threads on the current processor (e.g. GPU).
cuda::atomic<int, cuda::thread_scope_device> c;
// This atomic is suitable for threads in the same thread block.
cuda::atomic<int, cuda::thread_scope_block> d;
}