cuda-bindings 12.9.6 Release notes
Released on Mar 11, 2026
Highlights
cuda.bindings.nvml has graduated from experimental (cuda.bindings._nvml) to a fully supported public module with extensive handwritten Pythonic API coverage spanning ~170 functions across system queries, device discovery, memory, power, clocks, utilization, thermals, NVLink, and device configuration. (PR #1524, PR #1548)
Add nvFatbin bindings. (PR #1467)
Performance improvement: cuda.bindings now uses a faster enum implementation, rather than the standard library’s enum.IntEnum. This leads to much faster import times, and slightly faster attribute access times. (PR #1581)
Multiple performance improvements cumulatively reducing Python-to-C call overhead through faster void * conversion, faster result returning, optimized enum-to-vector conversion, and stack-allocated small arrays.
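To give a feel for why replacing enum.IntEnum can help, the following is a minimal sketch of one way a lightweight integer enum can be built. It is not the implementation used by cuda.bindings; the names LightMember, make_light_enum, and compare_build_times are hypothetical and exist only for this illustration. The idea shown is that members are plain int subclass instances stored as ordinary class attributes, so creating the class does not go through the standard library's enum metaclass machinery.

```python
import enum
import timeit


class StdColor(enum.IntEnum):
    # Standard-library enum: members are created by the enum metaclass.
    RED = 0
    GREEN = 1


class LightMember(int):
    """Hypothetical lightweight member: an int that remembers its name."""

    def __new__(cls, name, value):
        member = super().__new__(cls, value)
        member.name = name
        return member

    def __repr__(self):
        return f"<{self.name}: {int(self)}>"


def make_light_enum(class_name, members):
    """Build a plain class whose attributes are LightMember instances."""
    namespace = {name: LightMember(name, value) for name, value in members.items()}
    return type(class_name, (), namespace)


LightColor = make_light_enum("LightColor", {"RED": 0, "GREEN": 1})


def compare_build_times(member_count=200, repeats=20):
    """Time class creation for both approaches; returns (std, light) seconds."""
    members = {f"M{i}": i for i in range(member_count)}
    std_seconds = timeit.timeit(
        lambda: enum.IntEnum("StdEnum", members), number=repeats
    )
    light_seconds = timeit.timeit(
        lambda: make_light_enum("LightEnum", members), number=repeats
    )
    return std_seconds, light_seconds


if __name__ == "__main__":
    std_seconds, light_seconds = compare_build_times()
    print(f"enum.IntEnum creation: {std_seconds:.4f} s")
    print(f"lightweight creation:  {light_seconds:.4f} s")
```

The sketch deliberately omits features that enum.IntEnum provides, such as iteration, lookup by value, and protection against reassignment. Actual timings depend on the Python version and machine.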
Bugfixes
Miscellaneous
Faster void * conversion using stack-allocated buffers instead of heap allocation. (PR #1616)
Faster returning of results from driver, runtime, and NVRTC bindings. (PR #1647, PR #1656)
Faster conversion of enum sequences to vectors by eliminating temporary Python objects. (PR #1667)
Stack-allocated small numeric arrays in driver bindings, reducing heap allocation overhead. (PR #1545)
NVML bindings now use cuda_pathfinder for library discovery, consistent with other CUDA libraries. (PR #1661)
CUDA_HOME is no longer required at metadata resolution time (e.g. pip install --dry-run, uv lock); it is only needed at actual build time. (PR #1652)
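The CUDA_HOME change follows a general pattern: read the environment only in the step that needs it. The sketch below illustrates that pattern in isolation. It is not taken from the cuda-bindings build scripts; the functions get_cuda_home, resolve_metadata, and build_extension, and the package name in the metadata, are hypothetical.

```python
import os


def get_cuda_home():
    """Look up CUDA_HOME lazily; fail only when a real build asks for it."""
    cuda_home = os.environ.get("CUDA_HOME")
    if not cuda_home:
        raise RuntimeError("CUDA_HOME must be set to build from source")
    return cuda_home


def resolve_metadata():
    """Metadata step (hypothetical): never touches CUDA_HOME."""
    return {"name": "example-package", "version": "0.0.0"}


def build_extension():
    """Build step (hypothetical): the only place CUDA_HOME is required."""
    cuda_home = get_cuda_home()
    return {"include_dirs": [os.path.join(cuda_home, "include")]}


if __name__ == "__main__":
    # Works whether or not CUDA_HOME is set.
    print(resolve_metadata())
    try:
        print(build_extension())
    except RuntimeError as error:
        print(f"build step skipped: {error}")
```

With this layout, tools that only resolve metadata can run on machines without a CUDA toolkit, while a missing CUDA_HOME still produces a clear error at build time.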
Known issues
Updating from older versions (v12.6.2.post1 and below) via pip install -U cuda-python might not work. Please do a clean reinstallation instead: first uninstall with pip uninstall -y cuda-python, then install with pip install cuda-python.
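As a convenience, the check described above can be sketched in Python. This is an illustrative helper, not part of cuda-python: the names parse_version and needs_clean_reinstall are hypothetical, and the parser is a simplification that only understands dotted numeric versions with an optional .postN suffix.

```python
import re
from importlib import metadata

# Threshold taken from the known issue: this version and below are affected.
THRESHOLD = "12.6.2.post1"


def parse_version(text):
    """Parse 'X.Y.Z' or 'X.Y.Z.postN' into a comparable tuple (simplified)."""
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)(?:\.post(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    # Pad short releases with zeros so that 12.6 compares equal to 12.6.0.0.
    release = release + (0,) * (4 - len(release))
    # A missing post-release sorts before .post0.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return release, post


def needs_clean_reinstall(installed_version):
    """Return True if the installed version is at or below the threshold."""
    return parse_version(installed_version) <= parse_version(THRESHOLD)


if __name__ == "__main__":
    try:
        installed = metadata.version("cuda-python")
    except metadata.PackageNotFoundError:
        print("cuda-python is not installed; a plain install is enough.")
    else:
        if needs_clean_reinstall(installed):
            print("Run: pip uninstall -y cuda-python")
            print("Then: pip install cuda-python")
        else:
            print("An in-place upgrade should be fine: pip install -U cuda-python")
```

Version strings with pre-release or local segments raise ValueError in this sketch rather than being guessed at.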