CUB Modules

CUB provides state-of-the-art, reusable software components for every layer of the CUDA programming model:

  • Parallel primitives

    • Warp-wide “collective” primitives

      • Cooperative warp-wide prefix scan, reduction, etc.

      • Safely specialized for each underlying CUDA architecture

    • Block-wide “collective” primitives

      • Cooperative I/O, sort, scan, reduction, histogram, etc.

      • Compatible with arbitrary thread block sizes and types

    • Device-wide primitives

      • Parallel sort, prefix scan, reduction, histogram, etc.

      • Compatible with CUDA dynamic parallelism
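
As a rough illustration of how two of these layers look in practice, the sketch below sums the same array twice: once per thread block with the block-wide cub::BlockReduce collective, and once across the whole array with the device-wide cub::DeviceReduce::Sum primitive. The kernel name BlockSumKernel, the block size of 128 threads, and the input data are assumptions chosen for this example, and CUDA error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cub/cub.cuh>

constexpr int kBlockThreads = 128;  // illustrative block size (assumption)
constexpr int kBlocks = 4;          // illustrative grid size (assumption)
constexpr int kItems = kBlocks * kBlockThreads;

// Block-wide collective: all threads of a block cooperate to sum their inputs.
// BlockSumKernel is a hypothetical name used only for this sketch.
__global__ void BlockSumKernel(const int* in, int* block_sums) {
    using BlockReduce = cub::BlockReduce<int, kBlockThreads>;

    // Shared memory that the collective needs for inter-thread communication.
    __shared__ typename BlockReduce::TempStorage temp_storage;

    int value = in[blockIdx.x * kBlockThreads + threadIdx.x];

    // The aggregate is only valid in thread 0 of the block.
    int sum = BlockReduce(temp_storage).Sum(value);

    if (threadIdx.x == 0) {
        block_sums[blockIdx.x] = sum;
    }
}

int main() {
    int h_in[kItems];
    for (int i = 0; i < kItems; ++i) h_in[i] = 1;

    int* d_in = nullptr;
    int* d_block_sums = nullptr;
    int* d_total = nullptr;
    cudaMalloc(&d_in, kItems * sizeof(int));
    cudaMalloc(&d_block_sums, kBlocks * sizeof(int));
    cudaMalloc(&d_total, sizeof(int));
    cudaMemcpy(d_in, h_in, kItems * sizeof(int), cudaMemcpyHostToDevice);

    // Layer 1: block-wide collective called from inside a user-written kernel.
    BlockSumKernel<<<kBlocks, kBlockThreads>>>(d_in, d_block_sums);

    // Layer 2: device-wide primitive called from the host.
    // The first call, with a null temp pointer, only reports the scratch size.
    void* d_temp = nullptr;
    size_t temp_bytes = 0;
    cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_total, kItems);
    cudaMalloc(&d_temp, temp_bytes);
    // The second call performs the reduction.
    cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_total, kItems);

    int h_block_sums[kBlocks];
    int h_total = 0;
    cudaMemcpy(h_block_sums, d_block_sums, kBlocks * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaMemcpy(&h_total, d_total, sizeof(int), cudaMemcpyDeviceToHost);

    for (int b = 0; b < kBlocks; ++b) {
        std::printf("block %d sum = %d\n", b, h_block_sums[b]);
    }
    std::printf("device-wide sum = %d\n", h_total);

    cudaFree(d_temp);
    cudaFree(d_total);
    cudaFree(d_block_sums);
    cudaFree(d_in);
    return 0;
}
```

Block-wide and warp-wide collectives are called from device code and take their scratch space from shared memory declared by the caller, whereas device-wide primitives are launched from the host and use the two-call pattern shown here to size and then consume a temporary global-memory buffer. A warp-wide reduction with cub::WarpReduce follows the same shape as the block-wide case, with one TempStorage instance per warp.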