Thread-level Primitives
CUB thread-level algorithms are specialized for execution by a single thread.
cub::ThreadReduce
computes a reduction of a sequence of items
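To illustrate what a single-thread reduction means, here is a minimal host-side C++ sketch. It is not the CUB implementation or its API: `thread_reduce_sketch`, `Sum`, `Max`, and `example_sum` are hypothetical names introduced only for this example. The sketch assumes the items sit in a fixed-size array, as they would when one thread keeps a few items in registers, and folds them sequentially with a binary operator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, NOT the CUB API: sequentially folds a fixed-size
// array of items with a binary operator, as one thread would do on its own.
template <typename T, std::size_t N, typename ReductionOp>
T thread_reduce_sketch(const T (&items)[N], ReductionOp op)
{
    T aggregate = items[0];                   // seed with the first item
    for (std::size_t i = 1; i < N; ++i) {     // N is known at compile time
        aggregate = op(aggregate, items[i]);
    }
    return aggregate;
}

// Hypothetical binary operators used for illustration.
struct Sum
{
    template <typename T>
    T operator()(const T& a, const T& b) const { return a + b; }
};

struct Max
{
    template <typename T>
    T operator()(const T& a, const T& b) const { return (a < b) ? b : a; }
};

// Example: four items that a single thread might hold.
inline int example_sum()
{
    const int items[4] = {3, 1, 4, 1};
    const int total = thread_reduce_sketch(items, Sum{});
    assert(total == 9);                       // 3 + 1 + 4 + 1
    return total;
}
```

Because the item count is a compile-time constant, a compiler can fully unroll the loop. In device code that is typically what keeps the items in registers, although how CUB itself achieves this is not shown here.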