CUDA Python
CUDA Python is the home for accessing NVIDIA’s CUDA platform from Python. It consists of multiple components:
cuda.core: Pythonic access to CUDA Runtime and other core functionality
cuda.bindings: Low-level Python bindings to CUDA C APIs
cuda.pathfinder: Utilities for locating CUDA components installed in the user’s Python environment
cuda.coop: A Python module providing CCCL’s reusable block-wide and warp-wide device primitives for use within Numba CUDA kernels
cuda.compute: A Python module for easy access to CCCL’s highly efficient and customizable parallel algorithms, like sort, scan, reduce, transform, etc., that are callable on the host
numba.cuda: A Python DSL that exposes the CUDA SIMT programming model and compiles a restricted subset of Python code into CUDA kernels and device functions
cuda.tile: A new Python DSL that exposes the CUDA Tile programming model and allows users to write NumPy-like code in CUDA kernels
nvmath-python: Pythonic access to NVIDIA CPU & GPU Math Libraries, with host, device, and distributed APIs. It also provides low-level Python bindings to host C APIs (nvmath.bindings).
nvshmem4py: Pythonic interface to the NVSHMEM library, enabling Python applications to leverage NVSHMEM’s high-performance PGAS (Partitioned Global Address Space) programming model for GPU-accelerated computing
Nsight Python: Python kernel profiling interface that automates performance analysis across multiple kernel configurations using NVIDIA Nsight Tools
CUPTI Python: Python APIs for creation of profiling tools that target CUDA Python applications via the CUDA Profiling Tools Interface (CUPTI)
CUDA Python is currently undergoing an overhaul to improve existing components and introduce new ones.
All of the previously available functionality from the cuda-python package will continue to
be available. Please refer to the cuda.bindings documentation for the installation guide and further details.
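Because the components listed above are separate modules, it can help to check which of them are importable in the current environment. The following is a minimal sketch using only the Python standard library. The helper names `is_importable` and `component_status` are hypothetical and not part of any CUDA Python package, and the module names are taken directly from the component list above.

```python
import importlib.util

# Module names copied from the component list above.
COMPONENT_MODULES = [
    "cuda.core",
    "cuda.bindings",
    "cuda.pathfinder",
    "cuda.coop",
    "cuda.compute",
    "numba.cuda",
    "cuda.tile",
]


def is_importable(module_name):
    """Return True if module_name can be located in this environment.

    Hypothetical helper. find_spec does not import the module itself, but it
    does import parent packages, so a missing parent (for example, no "cuda"
    namespace at all) raises ModuleNotFoundError. That case is treated as
    "not installed".
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def component_status(modules=COMPONENT_MODULES):
    """Map each module name to True (importable) or False (missing)."""
    return {name: is_importable(name) for name in modules}


if __name__ == "__main__":
    for name, available in component_status().items():
        print(f"{name:16s} {'installed' if available else 'missing'}")
```

This only reports whether a module can be found. It does not verify that a compatible GPU, driver, or CUDA Toolkit is present.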