cub::LoadDirectStriped

Defined in cub/cub/block/block_load.cuh

template<int BLOCK_THREADS, typename InputT, int ITEMS_PER_THREAD, typename InputIteratorT>
void cub::LoadDirectStriped(int linear_tid, InputIteratorT block_itr, InputT (&items)[ITEMS_PER_THREAD], int valid_items)

Load a linear segment of items into a striped arrangement across the thread block, guarded by range.

Assumes a striped arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns items (i), (i + block-threads), …, (i + (block-threads * (items-per-thread - 1))). For multi-dimensional thread blocks, a row-major thread ordering is assumed.
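The ownership rule above can be illustrated with a small host-side C++ sketch. It is not CUB code: striped_offset, owned_offsets, and row_major_tid are hypothetical helpers introduced only to make the index arithmetic concrete.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not part of CUB): offset, relative to the block's base
// iterator, of the item-th element owned by thread `linear_tid` in a striped
// arrangement.
constexpr int striped_offset(int linear_tid, int item, int block_threads)
{
    return linear_tid + item * block_threads;
}

// Hypothetical helper: every offset owned by a single thread.
template <int BLOCK_THREADS, int ITEMS_PER_THREAD>
std::array<int, ITEMS_PER_THREAD> owned_offsets(int linear_tid)
{
    assert(linear_tid >= 0 && linear_tid < BLOCK_THREADS);
    std::array<int, ITEMS_PER_THREAD> offsets{};
    for (int item = 0; item < ITEMS_PER_THREAD; ++item)
    {
        offsets[item] = striped_offset(linear_tid, item, BLOCK_THREADS);
    }
    return offsets;
}

// Hypothetical helper: row-major linear thread id for a 2D block, where
// (x, y) stands in for (threadIdx.x, threadIdx.y) and dim_x for blockDim.x.
constexpr int row_major_tid(int x, int y, int dim_x)
{
    return y * dim_x + x;
}
```

With 4 threads and 3 items per thread, owned_offsets<4, 3>(1) yields the offsets 1, 5, and 9, so neighbouring threads touch neighbouring items on each pass.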

Template Parameters
  • BLOCK_THREADS – The thread block size in threads

  • InputT – [inferred] The data type to load.

  • ITEMS_PER_THREAD – [inferred] The number of items partitioned onto each thread.

  • InputIteratorT – [inferred] The random-access iterator type for input (may be a simple pointer type).

Parameters
  • linear_tid – [in] A suitable 1D thread-identifier for the calling thread (e.g., (threadIdx.y * blockDim.x) + threadIdx.x for 2D thread blocks)

  • block_itr – [in] The thread block’s base input iterator for loading from

  • items – [out] Data to load

  • valid_items – [in] Number of valid items to load
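The role of each parameter can be seen in a minimal host-side model of what one thread does during the guarded load. This is a sketch, not the library implementation: load_direct_striped_model is a hypothetical name, it runs on the CPU, and it assumes that slots whose offset is at or beyond valid_items are left unmodified.

```cpp
#include <cassert>

// Hypothetical host-side model (not part of CUB) of the guarded striped load
// for ONE thread. The template parameter order mirrors the signature above so
// that only BLOCK_THREADS has to be given explicitly.
template <int BLOCK_THREADS, typename InputT, int ITEMS_PER_THREAD, typename InputIteratorT>
void load_direct_striped_model(int linear_tid,
                               InputIteratorT block_itr,
                               InputT (&items)[ITEMS_PER_THREAD],
                               int valid_items)
{
    assert(linear_tid >= 0 && linear_tid < BLOCK_THREADS);
    for (int item = 0; item < ITEMS_PER_THREAD; ++item)
    {
        // Striped offset of this thread's item-th element.
        const int offset = linear_tid + item * BLOCK_THREADS;

        // Range guard: read only offsets below valid_items.
        // Assumption of this sketch: guarded-out slots keep their old value.
        if (offset < valid_items)
        {
            items[item] = block_itr[offset];
        }
    }
}
```

For example, with an input of 12 integers 10, 11, …, 21, an items array pre-filled with -1, and the call load_direct_striped_model<4>(1, input, items, 7), the model visits offsets 1, 5, and 9. The first two pass the guard, so items becomes {11, 15, -1}. Because a guarded-out slot keeps whatever it held before, callers under this assumption would initialize items beforehand whenever valid_items can be smaller than the full tile.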