cuda::aligned_size_t
Defined in headers <cuda/barrier> and <cuda/pipeline>:
template <cuda::std::size_t Alignment>
struct cuda::aligned_size_t {
  static constexpr cuda::std::size_t align = Alignment;
  cuda::std::size_t value;
  __host__ __device__ explicit constexpr aligned_size_t(cuda::std::size_t size);
  __host__ __device__ constexpr operator cuda::std::size_t();
};
The class template cuda::aligned_size_t is a shape representing an extent of bytes with a statically defined (address and size) alignment.
Preconditions:
The address of the extent of bytes must be aligned to an Alignment boundary.
The size of the extent of bytes must be a multiple of Alignment.
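The two preconditions can be checked at run time before a size is wrapped. The following is a minimal host-only sketch in standard C++; the helper name satisfies_alignment is hypothetical and is not part of libcu++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a libcu++ API): returns true when the byte extent
// starting at ptr with the given size meets both preconditions for Alignment.
template <std::size_t Alignment>
bool satisfies_alignment(const void* ptr, std::size_t size) {
  // Precondition 1: the address lies on an Alignment boundary.
  const auto address = reinterpret_cast<std::uintptr_t>(ptr);
  const bool address_ok = (address % Alignment) == 0;
  // Precondition 2: the size is a multiple of Alignment.
  const bool size_ok = (size % Alignment) == 0;
  return address_ok && size_ok;
}
```

A caller could use such a check in an assert in debug builds, once for each pointer that describes the extent, and fall back to passing a plain size when it fails.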
Template Parameters
| Alignment | The address and size alignment of the byte extent. |
Data Members
| align | The alignment of the byte extent. |
| value | The size of the byte extent. |
Member Functions
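| (constructor) | Constructs an aligned size. If the size is not a multiple of Alignment, the behavior is undefined. |
| (destructor) | Trivial implicit destructor. |
| (copy/move) | Trivial implicit copy/move. |
| operator cuda::std::size_t | Implicit conversion to cuda::std::size_t. |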
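To illustrate why the constructor is explicit while the conversion back to a plain size is implicit, here is a host-only stand-in written in standard C++. The names aligned_size_sketch, copy_path, and select_path are hypothetical; this mirrors the synopsis for illustration and is not the libcu++ implementation.

```cpp
#include <cassert>
#include <cstddef>

// Host-only stand-in that mirrors the synopsis; NOT the libcu++ type.
template <std::size_t Alignment>
struct aligned_size_sketch {
  static constexpr std::size_t align = Alignment;
  std::size_t value;
  // explicit: a plain size never turns into an aligned size by accident.
  explicit constexpr aligned_size_sketch(std::size_t size) : value(size) {}
  // implicit: an aligned size can be used wherever a plain size is expected.
  constexpr operator std::size_t() const { return value; }
};

// Hypothetical overload pair showing how an API can tell the shapes apart.
enum class copy_path { generic, aligned };

inline copy_path select_path(std::size_t) { return copy_path::generic; }

template <std::size_t Alignment>
copy_path select_path(aligned_size_sketch<Alignment>) {
  return copy_path::aligned;
}
```

With this design the caller opts in to the alignment guarantee by spelling out the wrapper type, and an overload taking the wrapper is preferred over the one taking a plain size.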
Notes
If Alignment is not a valid alignment, the behavior is undefined.
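An invalid alignment can be rejected at compile time with a small guard. The sketch below assumes that "valid alignment" means a nonzero power of two, as for alignments in standard C++; this page does not state that definition, and the names is_power_of_two and checked_alignment are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true when n is a nonzero power of two, which is the
// assumed meaning of "valid alignment" in this sketch.
constexpr bool is_power_of_two(std::size_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical guard: instantiating it with an alignment that fails the
// check above is a compile-time error.
template <std::size_t Alignment>
struct checked_alignment {
  static_assert(is_power_of_two(Alignment),
                "Alignment must be a nonzero power of two");
  static constexpr std::size_t value = Alignment;
};
```

Code could then write checked_alignment<16>::value as the template argument, so that a typo such as 24 fails to compile instead of leading to undefined behavior.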
Example
#include <cuda/barrier>

__global__ void example_kernel(void* dst, void* src, size_t size) {
  cuda::barrier<cuda::thread_scope_system> bar;
  init(&bar, 1);

  // Implementation cannot make assumptions about alignment.
  cuda::memcpy_async(dst, src, size, bar);

  // Implementation can assume that dst and src are 16-byte aligned
  // and that size is a multiple of 16, and may optimize accordingly.
  cuda::memcpy_async(dst, src, cuda::aligned_size_t<16>(size), bar);

  bar.arrive_and_wait();
}