Device-Wide Primitives

CUB device-level single-problem parallel algorithms:

  • cub::DeviceAdjacentDifference computes the difference between adjacent elements residing within device-accessible memory

  • cub::DeviceFor provides device-wide, parallel operations for iterating over data residing within device-accessible memory

  • cub::DeviceHistogram constructs histograms from data samples residing within device-accessible memory

  • cub::DevicePartition partitions data residing within device-accessible memory

  • cub::DeviceMergeSort sorts items residing within device-accessible memory using the merge sorting method

  • cub::DeviceRadixSort sorts items residing within device-accessible memory using the radix sorting method

  • cub::DeviceReduce computes a reduction of items residing within device-accessible memory

  • cub::DeviceRunLengthEncode demarcates “runs” of same-valued items within a sequence residing within device-accessible memory

  • cub::DeviceScan computes a prefix scan across a sequence of data items residing within device-accessible memory

  • cub::DeviceSelect compacts data residing within device-accessible memory
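
To make the terms in the list above concrete, here is a minimal sequential sketch of what a few of these operations compute: adjacent difference, exclusive prefix sum, run-length encoding, and flag-based selection. It is a host-side reference in plain C++, not CUB code. The function names (adjacent_difference_left, exclusive_sum, run_length_encode, select_flagged) are hypothetical and chosen for illustration only; the CUB primitives produce equivalent results in parallel on the device through their own interfaces.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Left adjacent difference (hypothetical helper): the first item is copied,
// every other output is in[i] - in[i - 1].
std::vector<int> adjacent_difference_left(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = (i == 0) ? in[0] : in[i] - in[i - 1];
    }
    return out;
}

// Exclusive prefix sum (hypothetical helper): out[i] is the sum of all
// items strictly before position i, so out[0] is 0.
std::vector<int> exclusive_sum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;
        running += in[i];
    }
    return out;
}

// Run-length encoding (hypothetical helper): one (value, length) pair per
// maximal run of equal consecutive items.
std::vector<std::pair<int, int>> run_length_encode(const std::vector<int>& in) {
    std::vector<std::pair<int, int>> runs;
    for (int x : in) {
        if (!runs.empty() && runs.back().first == x) {
            ++runs.back().second;      // extend the current run
        } else {
            runs.emplace_back(x, 1);   // start a new run
        }
    }
    return runs;
}

// Selection / compaction (hypothetical helper): keep the items whose flag
// is set, preserving their relative order.
std::vector<int> select_flagged(const std::vector<int>& in,
                                const std::vector<bool>& flags) {
    std::vector<int> out;
    for (std::size_t i = 0; i < in.size() && i < flags.size(); ++i) {
        if (flags[i]) {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

A sequential reference like this is mainly useful for checking the output of a device-wide call on small inputs; it says nothing about how the parallel versions are implemented.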

CUB device-level segmented-problem (batched) parallel algorithms:

  • cub::DeviceSegmentedSort computes a batched sort across non-overlapping sequences of data residing within device-accessible memory

  • cub::DeviceSegmentedRadixSort computes a batched radix sort across non-overlapping sequences of data residing within device-accessible memory

  • cub::DeviceSegmentedReduce computes reductions across multiple sequences of data residing within device-accessible memory

  • cub::DeviceCopy provides device-wide, parallel operations for batched copying of ranges of elements residing within device-accessible memory

  • cub::DeviceMemcpy provides device-wide, parallel operations for batched byte-wise copying of buffers residing within device-accessible memory
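
The segmented (batched) algorithms above apply one operation independently to each of many non-overlapping sequences. As a rough illustration, the sequential sketch below reduces and sorts each segment of a flat array. It is plain host-side C++, not CUB code. The function names (segmented_sum, segmented_sort) are hypothetical, and the layout is an assumption of this sketch: segment s covers the half-open index range [offsets[s], offsets[s + 1]), so n segments are described by n + 1 offsets.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Segmented sum (hypothetical helper): one result per segment.
// Assumes offsets is non-decreasing and its last entry is <= data.size().
// An empty segment yields 0 in this sketch.
std::vector<int> segmented_sum(const std::vector<int>& data,
                               const std::vector<std::size_t>& offsets) {
    std::vector<int> sums;
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        int total = 0;
        for (std::size_t i = offsets[s]; i < offsets[s + 1]; ++i) {
            total += data[i];
        }
        sums.push_back(total);
    }
    return sums;
}

// Segmented sort (hypothetical helper): sorts each segment in ascending
// order, leaving segment boundaries where they are.
std::vector<int> segmented_sort(std::vector<int> data,
                                const std::vector<std::size_t>& offsets) {
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        std::sort(data.begin() + static_cast<std::ptrdiff_t>(offsets[s]),
                  data.begin() + static_cast<std::ptrdiff_t>(offsets[s + 1]));
    }
    return data;
}
```

For example, with data {8, 6, 7, 5, 3, 0, 9} and offsets {0, 3, 3, 7}, the three segments are {8, 6, 7}, an empty segment, and {5, 3, 0, 9}; items never move from one segment to another.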