cuda-bindings 13.3.0 Release notes
Released on May 26, 2026
Highlights
Support for new APIs introduced in CUDA 13.3, including driver logical endpoint APIs, graph recapture APIs, NVRTC Tile IR and bundled-header APIs, and related runtime graph/event APIs. (PR #2139)
Add cuda.bindings.cudla bindings. (PR #2034)
Add the nvvmLLVMVersion binding. (PR #1774)
Add additional NVML APIs introduced in CUDA 13.2. (PR #1830)
Bugfixes
Fixed the cuDevSmResourceSplit and cudaDevSmResourceSplit binding signatures so groupParams is accepted as a sequence matching the CUDA API. (PR #1766)
Fixed nested resource pointer handling to accept both str and bytes inputs. (PR #1698)
Fixed nvmlDeviceGetFieldValues and nvmlDeviceClearFieldValues handling of empty field lists so they return empty results instead of raising NVML_ERROR_INVALID_ARGUMENT. (PR #1982)
Fixed CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=0 incorrectly enabling per-thread default stream mode. (PR #2076)
Fixed a use-after-free in cudaGraphGetEdges, cudaGraphNodeGetDependencies, cudaGraphNodeGetDependentNodes, cudaStreamGetCaptureInfo, and their driver-API counterparts (cuGraphGetEdges, cuGraphNodeGetDependencies, cuGraphNodeGetDependentNodes, cuStreamGetCaptureInfo). The returned cudaGraphEdgeData/CUgraphEdgeData wrappers were backed by a scratch buffer that was freed before the call returned, leaving every wrapper holding a dangling pointer. The returned wrappers now own deep copies of the edge data. (Issue #1804, PR #2083)
Fixed a double-free in the generated setters for list-valued struct members (e.g. CUlaunchConfig.attrs, CUDA_MEM_ALLOC_NODE_PARAMS.accessDescs, external-semaphore and batch-mem-op node parameter arrays, and their runtime counterparts). Assigning an empty list freed the internal buffer but left the cached pointer non-NULL, so a subsequent assignment or __dealloc__ would call free() again on the dangling pointer. (PR #2112)
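The CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=0 fix is an instance of a general pitfall with environment-variable flags. The sketch below is illustrative only: env_flag_enabled is a hypothetical helper, not the cuda-bindings implementation, and the set of values it treats as "off" is an assumption. It shows one common way such a bug can arise (testing the raw string for truthiness) next to a stricter check.

```python
import os


def env_flag_enabled(name, environ=None):
    """Hypothetical helper: treat an unset, empty, or "0" value as disabled.

    This is not the cuda-bindings parsing logic; the accepted values are an
    assumption made for illustration.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return False
    return value.strip() not in ("", "0")


FLAG = "CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM"

# Naive check: any non-empty string is truthy in Python, so "0" reads as "on".
naive_result = bool({FLAG: "0"}.get(FLAG))

# Stricter check: "0" is recognized as "off".
strict_result = env_flag_enabled(FLAG, {FLAG: "0"})
```

Passing the environment as a mapping keeps the helper easy to exercise without mutating os.environ.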
Miscellaneous
Add cuda.bindings.utils.check_nvvm_compiler_options() to check whether a set of NVVM compiler options is supported by the installed NVVM library. (PR #1837)
NVRTC bindings now use pre-generated Cython files and no longer require pyclibrary header parsing at build time. (PR #1900)
Improved generated documentation and argument names, including a fix for the ind_ex argument naming bug. (PR #1927, PR #2082)
Fixed cuda-bindings debug builds. (PR #1890)
Declare cuda-pathfinder as a host dependency for pixi path-dependency builds of cuda-bindings. (PR #1926)
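As a rough illustration of the new check_nvvm_compiler_options() utility, the sketch below probes whether an option is usable before passing it to NVVM. The call signature (a sequence of option strings in, a boolean out) is an assumption not stated in these notes, and nvvm_options_supported is a hypothetical wrapper that returns None when cuda-bindings or the NVVM library is unavailable.

```python
def nvvm_options_supported(options):
    """Hypothetical wrapper around check_nvvm_compiler_options.

    Returns True or False when the check can run, or None when cuda-bindings
    (or a usable NVVM library) is not available on this machine.
    """
    try:
        from cuda.bindings.utils import check_nvvm_compiler_options
    except ImportError:
        return None
    try:
        # Assumed signature: a sequence of option strings, returning a bool.
        return bool(check_nvvm_compiler_options(list(options)))
    except Exception:
        # Treat any failure to run the check as "unknown".
        return None


# "-opt=3" is a standard libNVVM optimization-level option.
result = nvvm_options_supported(["-opt=3"])
```

A caller could use the result to drop an unsupported option rather than fail at compile time.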
Known issues
Updating from older versions (v12.6.2.post1 and below) via pip install -U cuda-python might not work. Instead, do a clean reinstall: run pip uninstall -y cuda-python, then pip install cuda-python.
nvml.system_get_process_name on WSL can return incorrect values. To work around this, set the locale to “C” before calling nvml.device_get_compute_running_processes_v3 (which sets the process names) and before calling nvml.system_get_process_name. cuda_core does this automatically, but users of the raw NVML API need to do this manually.
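For users of the raw NVML API, the locale workaround might look like the following minimal sketch. It uses only the Python standard library. The c_locale context manager is a hypothetical helper, and both the choice of LC_ALL and the decision to restore the previous locale afterwards are assumptions. The NVML calls are indicated in comments only, since their exact arguments are not given in these notes.

```python
import contextlib
import locale


@contextlib.contextmanager
def c_locale():
    """Temporarily switch the process-wide locale to "C", then restore it."""
    # Calling setlocale() without a second argument queries the current setting.
    saved = locale.setlocale(locale.LC_ALL)
    locale.setlocale(locale.LC_ALL, "C")
    try:
        yield
    finally:
        locale.setlocale(locale.LC_ALL, saved)


before = locale.setlocale(locale.LC_ALL)

with c_locale():
    # Raw NVML calls would go here: first
    # nvml.device_get_compute_running_processes_v3, then
    # nvml.system_get_process_name for each process of interest.
    active = locale.setlocale(locale.LC_ALL)

after = locale.setlocale(locale.LC_ALL)
```

The locale is process-wide state, so switching it from one thread affects all threads. Applications that depend on a non-"C" locale elsewhere may prefer to make the NVML queries early, before other threads start.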