cub::BlockRadixRankMatch

Defined in cub/block/block_radix_rank.cuh

template<int BLOCK_DIM_X, int RADIX_BITS, bool IS_DESCENDING, BlockScanAlgorithm INNER_SCAN_ALGORITHM = BLOCK_SCAN_WARP_SCANS, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int LEGACY_PTX_ARCH = 0>
class BlockRadixRankMatch

Radix-rank using match.any.

Collective constructors

inline BlockRadixRankMatch(TempStorage &temp_storage)

Collective constructor using the specified memory allocation as temporary storage.

Parameters

temp_storage[in] Reference to memory allocation having layout type TempStorage

Ranking

template<int KEYS_PER_THREAD, typename CountsCallback>
inline void CallBack(CountsCallback callback)

Computes the count of keys for each digit value, and calls the callback with the array of key counts.

Template Parameters

CountsCallback – The callback type. It should implement an instance overload of operator()(int (&bins)[BINS_TRACKED_PER_THREAD]), where bins is an array of key counts for each digit value, distributed in block distribution among the threads of the thread block. Key counts can be used to update other data structures in global or shared memory. Depending on the implementation of the ranking algorithm (see BlockRadixRankMatchEarlyCounts), key counts may become available early; they are therefore returned through a callback rather than a separate output parameter of RankKeys().

template<typename UnsignedBits, int KEYS_PER_THREAD, typename DigitExtractorT, typename CountsCallback>
inline void RankKeys(UnsignedBits (&keys)[KEYS_PER_THREAD], int (&ranks)[KEYS_PER_THREAD], DigitExtractorT digit_extractor, CountsCallback callback)

Rank keys by the digit that digit_extractor selects, and pass the per-digit key counts to callback.

Parameters
  • keys[in] Keys for this tile

  • ranks[out] For each key, the local rank within the tile

  • digit_extractor[in] The digit extractor

  • callback[in] The callback invoked with the array of key counts (see CountsCallback)

template<typename UnsignedBits, int KEYS_PER_THREAD, typename DigitExtractorT>
inline void RankKeys(UnsignedBits (&keys)[KEYS_PER_THREAD], int (&ranks)[KEYS_PER_THREAD], DigitExtractorT digit_extractor)

template<typename UnsignedBits, int KEYS_PER_THREAD, typename DigitExtractorT, typename CountsCallback>
inline void RankKeys(UnsignedBits (&keys)[KEYS_PER_THREAD], int (&ranks)[KEYS_PER_THREAD], DigitExtractorT digit_extractor, int (&exclusive_digit_prefix)[BINS_TRACKED_PER_THREAD], CountsCallback callback)

Rank keys.

The lower RADIX_DIGITS threads each receive the results for their corresponding digits: the key counts through callback and the exclusive prefix sums through exclusive_digit_prefix.

Parameters
  • keys[in] Keys for this tile

  • ranks[out] For each key, the local rank within the tile

  • digit_extractor[in] The digit extractor

  • exclusive_digit_prefix[out] The exclusive prefix sum for the digits [(threadIdx.x * BINS_TRACKED_PER_THREAD) … (threadIdx.x * BINS_TRACKED_PER_THREAD) + BINS_TRACKED_PER_THREAD - 1]

template<typename UnsignedBits, int KEYS_PER_THREAD, typename DigitExtractorT>
inline void RankKeys(UnsignedBits (&keys)[KEYS_PER_THREAD], int (&ranks)[KEYS_PER_THREAD], DigitExtractorT digit_extractor, int (&exclusive_digit_prefix)[BINS_TRACKED_PER_THREAD])

Parameters
  • keys[in] Keys for this tile

  • ranks[out] For each key, the local rank within the tile

  • digit_extractor[in] The digit extractor

  • exclusive_digit_prefix[out] The exclusive prefix sum for the digits [(threadIdx.x * BINS_TRACKED_PER_THREAD) … (threadIdx.x * BINS_TRACKED_PER_THREAD) + BINS_TRACKED_PER_THREAD - 1]

Public Types

enum [anonymous]

Values:

enumerator BINS_TRACKED_PER_THREAD

Number of bin-starting offsets tracked per thread.

struct TempStorage : public Uninitialized<_TempStorage>

The operations exposed by BlockRadixRankMatch require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union’d with other storage allocation types to facilitate memory reuse.