CUDA Python

CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It consists of multiple components:

  • cuda.core: Pythonic access to CUDA runtime and other core functionalities

  • cuda.bindings: Low-level Python bindings to CUDA C APIs

  • cuda.cooperative: A Python package providing CCCL’s reusable block-wide and warp-wide device primitives for use within Numba CUDA kernels

  • cuda.parallel: A Python package for easy access to CCCL’s highly efficient and customizable parallel algorithms, like sort, scan, reduce, transform, etc, that are callable on the host

  • numba.cuda: Numba’s target for CUDA GPU programming by directly compiling a restricted subset of Python code into CUDA kernels and device functions following the CUDA execution model.

For access to NVIDIA CPU & GPU Math Libraries, please refer to nvmath-python.

CUDA Python is currently undergoing an overhaul to improve existing and bring up new components. All of the previously available functionalities from the cuda-python package will continue to be available, please refer to the cuda.bindings documentation for installation guide and further detail.