
Defined in cub/cub/util_ptx.cuh

int cub::PRMT(unsigned int a, unsigned int b, unsigned int index)


Pick four arbitrary bytes from two 32-bit registers, and reassemble them into a 32-bit destination register. For SM2.0 or later.

The bytes in the two source registers a and b are numbered from 0 to 7: {b, a} = {{b7, b6, b5, b4}, {b3, b2, b1, b0}}. Each of the four bytes of the return value, written here as {d3, d2, d1, d0}, is chosen by a 4-bit selector held in one of the four lower “nibbles” of index: {index} = {n7, n6, n5, n4, n3, n2, n1, n0}. Nibble n0 selects d0 (the least significant byte of the result), n1 selects d1, n2 selects d2, and n3 selects d3.
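The selection rule can be checked on the host with a small reference routine. The sketch below is not part of CUB: `prmt_reference` is a hypothetical helper name, and it assumes the default (generic) mode of the PTX `prmt` instruction, in which the low three bits of each selector nibble pick a source byte and the top bit of the nibble requests replication of that byte's sign bit instead of a plain copy.

```cpp
#include <cstdint>

// Hypothetical host-side model of a default-mode byte permute.
// Not a CUB function; intended only to illustrate how index is decoded.
inline std::uint32_t prmt_reference(std::uint32_t a, std::uint32_t b, std::uint32_t index)
{
    // Concatenate {b, a}: byte 0 is the low byte of a, byte 7 the high byte of b.
    const std::uint64_t bytes = (static_cast<std::uint64_t>(b) << 32) | a;

    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
    {
        // Nibble n_i of index controls byte i of the result.
        const std::uint32_t nibble = (index >> (4 * i)) & 0xFu;

        // The low three bits of the nibble choose one of the eight source bytes.
        std::uint32_t byte =
            static_cast<std::uint32_t>((bytes >> (8 * (nibble & 0x7u))) & 0xFFu);

        // Assumed default-mode behaviour: if the top bit of the nibble is set,
        // the sign bit of the chosen byte is replicated across all eight bits.
        if (nibble & 0x8u)
        {
            byte = (byte & 0x80u) ? 0xFFu : 0x00u;
        }

        result |= byte << (8 * i);
    }
    return result;
}
```

With a = 0x03020100 and b = 0x07060504, an index of 0x00007531 yields 0x07050301, while 0x00000123 reverses the byte order of a and yields 0x00010203.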


The code snippet below illustrates byte-permute.

#include <cub/cub.cuh>

__global__ void ExampleKernel(...)
{
    int a        = 0x03020100;
    int b        = 0x07060504;
    int index    = 0x00007531;

    // Selector nibbles 1, 3, 5, 7 pick source bytes b1, b3, b5, b7
    int selected = cub::PRMT(a, b, index);    // 0x07050301
}