cub::StoreDirectWarpStriped
Defined in cub/block/block_store.cuh
template<typename T, int ITEMS_PER_THREAD, typename OutputIteratorT>
void cub::StoreDirectWarpStriped(int linear_tid, OutputIteratorT block_itr, T (&items)[ITEMS_PER_THREAD], int valid_items)
Store a warp-striped arrangement of data across the thread block into a linear segment of items, guarded by range.
Assumes a warp-striped arrangement of elements across threads, where warp i owns the i-th range of (warp-threads * items-per-thread) contiguous items. Within that range, the thread in lane j owns items (j), (j + warp-threads), …, (j + (warp-threads * (items-per-thread - 1))).
Usage Considerations
The number of threads in the thread block must be a multiple of the architecture’s warp size.
- Template Parameters
T – [inferred] The data type to store.
ITEMS_PER_THREAD – [inferred] The number of consecutive items partitioned onto each thread.
OutputIteratorT – [inferred] The random-access iterator type for output (may be a simple pointer type).
- Parameters
linear_tid – [in] A suitable 1D thread-identifier for the calling thread (e.g., (threadIdx.y * blockDim.x) + threadIdx.x for 2D thread blocks).
block_itr – [in] The thread block’s base output iterator for storing to.
items – [in] Data to store
valid_items – [in] Number of valid items to write. Output positions at or beyond this count are left untouched.
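Example
The warp-striped layout and the range guard described above can be illustrated with a small host-side emulation. This is a sketch, not the library's device code: the name store_warp_striped_host and the constant kWarpThreads are hypothetical, a warp size of 32 is assumed, and the output is a plain pointer rather than a general iterator.

```cpp
#include <cassert>

// Assumption for this sketch: 32 threads per warp.
constexpr int kWarpThreads = 32;

// Hypothetical host-side emulation of a guarded warp-striped store.
// One call corresponds to what a single thread (linear_tid) would write.
template <typename T, int ITEMS_PER_THREAD>
void store_warp_striped_host(int linear_tid,
                             T* block_out,
                             const T (&items)[ITEMS_PER_THREAD],
                             int valid_items)
{
    assert(linear_tid >= 0);

    const int lane = linear_tid % kWarpThreads;  // position within the warp
    const int warp = linear_tid / kWarpThreads;  // which warp the thread is in

    // Each warp owns a contiguous range of (warp-threads * items-per-thread) items.
    const int warp_offset = warp * kWarpThreads * ITEMS_PER_THREAD;

    for (int item = 0; item < ITEMS_PER_THREAD; ++item)
    {
        // Within the warp's range, a thread's items are strided by the warp size.
        const int idx = warp_offset + lane + item * kWarpThreads;

        // Range guard: skip positions at or beyond valid_items.
        if (idx < valid_items)
        {
            block_out[idx] = items[item];
        }
    }
}
```

Calling this function once per thread index, for a block of 64 threads with 2 items each, writes thread 0's items to positions 0 and 32 and thread 32's items to positions 64 and 96. With valid_items set to 100, positions 100 through 127 are never written.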