cuda::aligned_size_t
Defined in headers <cuda/barrier> and <cuda/pipeline>:
template <cuda::std::size_t Alignment>
struct cuda::aligned_size_t {
  static constexpr cuda::std::size_t align = Alignment;
  cuda::std::size_t value;
  __host__ __device__ explicit constexpr aligned_size_t(cuda::std::size_t size);
  __host__ __device__ constexpr operator cuda::std::size_t();
};
The class template cuda::aligned_size_t is a shape representing an extent of bytes with a statically defined alignment that applies to both its address and its size.
Preconditions:
- The address of the extent of bytes must be aligned to an Alignment-byte boundary.
- The size of the extent of bytes must be a multiple of Alignment.
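The synopsis shows no runtime checks, so a caller may want to verify both preconditions itself, for example in debug builds. The following is a minimal host-side C++ sketch under stated assumptions, not part of libcu++: the helper names `is_aligned_to` and `meets_aligned_size_preconditions` are hypothetical, and the power-of-two check assumes that a "valid alignment" follows the usual C++ rule.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a libcu++ API): true if `ptr` lies on an
// `Alignment`-byte boundary.
template <std::size_t Alignment>
bool is_aligned_to(const void* ptr) {
  // Assumption: a valid alignment is a nonzero power of two.
  static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                "Alignment must be a nonzero power of two");
  return reinterpret_cast<std::uintptr_t>(ptr) % Alignment == 0;
}

// Hypothetical helper: true if a copy of `size` bytes from `src` to `dst`
// satisfies both preconditions listed above for the given `Alignment`.
template <std::size_t Alignment>
bool meets_aligned_size_preconditions(const void* dst, const void* src,
                                      std::size_t size) {
  return is_aligned_to<Alignment>(dst) && is_aligned_to<Alignment>(src) &&
         size % Alignment == 0;
}
```

Marked `__host__ __device__`, a check of this kind could guard the choice between passing `cuda::aligned_size_t<16>(size)` and passing a plain size to a copy routine.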
Template Parameters
Alignment | The address and size alignment of the byte extent. |
Data Members
align | The alignment of the byte extent. |
value | The size of the byte extent. |
Member Functions
(constructor) | Constructs an aligned size. If the size is not a multiple of Alignment, the behavior is undefined. |
(destructor) [implicitly declared] | Trivial implicit destructor. |
operator= [implicitly declared] | Trivial implicit copy/move assignment. |
operator cuda::std::size_t | Implicit conversion to cuda::std::size_t. |
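To illustrate how the explicit constructor and the implicit conversion interact, here is a small stand-in written in plain C++. It mirrors the synopsis for illustration only: `aligned_size_sketch` and `bytes_to_copy` are hypothetical names, the `__host__ __device__` annotations are omitted, and the `const` qualifier on the conversion operator is an assumption of this sketch.

```cpp
#include <cstddef>

// Stand-in that mirrors the synopsis for illustration only; this is NOT the
// libcu++ type.
template <std::size_t Alignment>
struct aligned_size_sketch {
  static constexpr std::size_t align = Alignment;
  std::size_t value;

  // Explicit: a plain integer never silently becomes an aligned size.
  explicit constexpr aligned_size_sketch(std::size_t size) : value(size) {}

  // Implicit: an aligned size can be used wherever a byte count is expected.
  // (`const` is an assumption of this sketch so it works on constexpr objects.)
  constexpr operator std::size_t() const { return value; }
};

// Hypothetical consumer that only needs a byte count.
constexpr std::size_t bytes_to_copy(std::size_t n) { return n; }

constexpr aligned_size_sketch<16> sixty_four(64);
static_assert(aligned_size_sketch<16>::align == 16,
              "the alignment is part of the type");
static_assert(sixty_four.value == 64, "value stores the size in bytes");
static_assert(bytes_to_copy(sixty_four) == 64,
              "implicit conversion back to a plain size");

// aligned_size_sketch<16> bad = 64;  // would not compile: explicit constructor
```

The asymmetry is deliberate in this sketch: promising alignment requires spelling out the type, while dropping the promise is free.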
Notes
If Alignment is not a valid alignment, the behavior is undefined.
Example
#include <cuda/barrier>

__global__ void example_kernel(void* dst, void* src, size_t size) {
  cuda::barrier<cuda::thread_scope_system> bar;
  init(&bar, 1);

  // Implementation cannot make assumptions about alignment.
  cuda::memcpy_async(dst, src, size, bar);

  // Implementation can assume that dst and src are 16-byte aligned
  // and that size is a multiple of 16, and may optimize accordingly.
  cuda::memcpy_async(dst, src, cuda::aligned_size_t<16>(size), bar);

  bar.arrive_and_wait();
}
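When a size is not known to be a multiple of the alignment, one possible pattern (an assumption here, not something this page prescribes) is to split the copy into an aligned bulk and a short tail. The sketch below only computes the two byte counts in plain C++; `copy_split` and `split_for_alignment` are hypothetical names, not libcu++ APIs.

```cpp
#include <cstddef>

// Hypothetical helper type (not a libcu++ API): byte counts for the two parts.
struct copy_split {
  std::size_t bulk;  // largest multiple of Alignment not exceeding size
  std::size_t tail;  // remaining bytes, always less than Alignment
};

// Splits `size` into a part that is a multiple of `Alignment` and a remainder.
template <std::size_t Alignment>
constexpr copy_split split_for_alignment(std::size_t size) {
  static_assert(Alignment != 0, "Alignment must be nonzero");
  return copy_split{size - size % Alignment, size % Alignment};
}

static_assert(split_for_alignment<16>(100).bulk == 96, "6 full 16-byte chunks");
static_assert(split_for_alignment<16>(100).tail == 4, "4 bytes left over");
```

The bulk count could then be wrapped in `cuda::aligned_size_t<16>` and the tail, starting at byte offset `bulk`, copied with a plain size. The pointers passed with the bulk count still have to satisfy the address precondition.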