cuda::bit_reverse
template <typename T>
[[nodiscard]] constexpr T
bit_reverse(T value) noexcept;
The function reverses the order of bits in a value.
Parameters
value
: Input value
Return value
Value with reversed bits
Constraints
T
is an unsigned integer type.
Performance considerations
The function performs the following operations:
Device:
uint8_t
uint16_t
:PRMT
,BREV
uint32_t
:BREV
uint64_t
:BREV
x2,MOV
x2uint128_t
:BREV
x4,MOV
x4
Host:
__builtin_bitreverse<N>
with clang
Note
When the input values are run-time values that the compiler can resolve at compile-time, e.g. an index of a loop with a fixed number of iterations, using the function could not be optimal.
Note
GCC <= 8 uses a slow path with more instructions even in CUDA
Example
#include <cuda/bit>
#include <cuda/std/cassert>
__global__ void bit_reverse_kernel() {
assert(bitfield_reverse(0u) == ~0u);
assert(bitfield_reverse(uint8_t{0b00001011}) == uint8_t{0b11010000});
}
int main() {
bit_reverse_kernel<<<1, 1>>>();
cudaDeviceSynchronize();
return 0;
}