cuda-bindings 12.9.6 Release notes

Released on Mar 11, 2026

Highlights

  • cuda.bindings.nvml has graduated from experimental (cuda.bindings._nvml) to a fully supported public module with extensive handwritten Pythonic API coverage spanning ~170 functions across system queries, device discovery, memory, power, clocks, utilization, thermals, NVLink, and device configuration. (PR #1524, PR #1548)

  • Added nvFatbin bindings. (PR #1467)

  • Performance improvement: cuda.bindings now uses a faster enum implementation instead of the standard library’s enum.IntEnum. This makes imports much faster and attribute access slightly faster. (PR #1581)

  • Multiple performance improvements cumulatively reduce Python-to-C call overhead: faster void * conversion, faster result returning, optimized enum-to-vector conversion, and stack-allocated small arrays.
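
To give a feel for why replacing enum.IntEnum can speed up imports, the following sketch compares building a large enum.IntEnum with building a plain class of named int constants. It is an illustration only: LightInt, build_light_enum, and the member names are hypothetical and are not the implementation used in cuda.bindings.

```python
import enum
import time


def make_members(count):
    """Generate `count` hypothetical member names mapped to integer values."""
    return {f"VALUE_{i}": i for i in range(count)}


def build_int_enum(count):
    """Standard-library path: every member passes through the enum machinery."""
    return enum.IntEnum("StdEnum", make_members(count))


class LightInt(int):
    """Hypothetical lightweight member: an int that remembers its name."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return f"<{self.name}: {int(self)}>"


def build_light_enum(count):
    """Build a plain class whose attributes are LightInt constants."""
    namespace = {
        name: LightInt(name, value) for name, value in make_members(count).items()
    }
    return type("LightEnum", (), namespace)


if __name__ == "__main__":
    # Timings vary by machine and Python version; no specific numbers are implied.
    for label, builder in (
        ("enum.IntEnum", build_int_enum),
        ("lightweight", build_light_enum),
    ):
        start = time.perf_counter()
        builder(2000)
        elapsed_ms = (time.perf_counter() - start) * 1e3
        print(f"{label:>12}: {elapsed_ms:.2f} ms for 2000 members")
```

Both variants produce members that compare equal to plain integers. The lightweight one gives up features such as iteration and value-to-member lookup in exchange for cheaper class creation.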

Bugfixes

  • Fixed an issue where the CU_POINTER_ATTRIBUTE_DEVICE_ORDINAL attribute was retrieved as an unsigned int, rather than a signed int. (PR #1336)

  • Fixed a use-after-free in _HelperInputVoidPtr properties when backed by Python buffer objects. (PR #1629)
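
The signed versus unsigned fix matters whenever the ordinal is negative. The sketch below uses only the standard library to show how the same 32 bits read differently under each interpretation. The value -1 is an illustrative assumption, not a statement about what the driver returns in any particular case.

```python
import struct


def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer's bits as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


def as_signed32(raw):
    """Reinterpret an unsigned 32-bit integer's bits as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", raw))[0]


if __name__ == "__main__":
    ordinal = -1  # hypothetical negative ordinal
    misread = as_unsigned32(ordinal)
    print(misread)               # 4294967295
    print(as_signed32(misread))  # -1
```

Non-negative ordinals are identical under both interpretations, which is why a mismatch of this kind only shows up for negative values.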

Miscellaneous

  • Faster void * conversion using stack-allocated buffers instead of heap allocation. (PR #1616)

  • Faster returning of results from driver, runtime, and NVRTC bindings. (PR #1647, PR #1656)

  • Faster conversion of enum sequences to vectors by eliminating temporary Python objects. (PR #1667)

  • Stack-allocated small numeric arrays in driver bindings, reducing heap allocation overhead. (PR #1545)

  • NVML bindings now use cuda_pathfinder for library discovery, consistent with other CUDA libraries. (PR #1661)

  • CUDA_HOME is no longer required at metadata resolution time (e.g. pip install --dry-run, uv lock); it is only needed at actual build time. (PR #1652)
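
As a rough sketch of using the now-public cuda.bindings.nvml module, the function below lists GPU names. The function names (init_v2, device_get_count_v2, device_get_handle_by_index_v2, device_get_name, shutdown) are assumptions based on the snake_case convention of the bindings and are not confirmed by these notes. Check the cuda.bindings.nvml API reference before relying on them. The helper list_gpu_names is hypothetical.

```python
def list_gpu_names():
    """Return a list of GPU names, or None if NVML cannot be used here."""
    try:
        from cuda.bindings import nvml
    except ImportError:
        # cuda-bindings is missing or too old to ship the public nvml module.
        return None

    try:
        nvml.init_v2()  # assumed name, mirroring nvmlInit_v2
    except Exception:
        # No driver or no NVML library found, or the assumed name is wrong.
        return None

    try:
        names = []
        count = nvml.device_get_count_v2()  # assumed name
        for index in range(count):
            handle = nvml.device_get_handle_by_index_v2(index)  # assumed name
            names.append(nvml.device_get_name(handle))  # assumed name
        return names
    finally:
        # Always release NVML once initialization has succeeded.
        nvml.shutdown()


if __name__ == "__main__":
    gpu_names = list_gpu_names()
    if gpu_names is None:
        print("NVML is not available in this environment")
    else:
        for index, name in enumerate(gpu_names):
            print(f"GPU {index}: {name}")
```

On machines without an NVIDIA driver or without cuda-bindings installed, the function returns None instead of raising.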

Known issues

  • Updating from older versions (v12.6.2.post1 and below) via pip install -U cuda-python might not work. Instead, do a clean reinstallation: run pip uninstall -y cuda-python, then run pip install cuda-python.
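
To check whether an environment is affected by this known issue, a small helper can compare the installed cuda-python version against the v12.6.2.post1 threshold. This is a sketch under stated assumptions: the helper names are hypothetical, and the parser only understands versions of the form X.Y.Z or X.Y.Z.postN.

```python
import re
from importlib import metadata

# v12.6.2.post1 expressed as (major, minor, patch, post)
THRESHOLD = (12, 6, 2, 1)


def parse_version(text):
    """Parse 'X.Y.Z' or 'X.Y.Z.postN' into a tuple; post is 0 when absent."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unsupported version format: {text!r}")
    major, minor, patch, post = match.groups()
    return (int(major), int(minor), int(patch), int(post or 0))


def needs_clean_reinstall(installed):
    """True if the installed version is v12.6.2.post1 or older."""
    return parse_version(installed) <= THRESHOLD


def installed_cuda_python_version():
    """Return the installed cuda-python version string, or None if absent."""
    try:
        return metadata.version("cuda-python")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_cuda_python_version()
    if version is None:
        print("cuda-python is not installed; a plain install is enough")
    else:
        try:
            affected = needs_clean_reinstall(version)
        except ValueError as error:
            print(f"could not classify this version: {error}")
        else:
            if affected:
                print("uninstall first: pip uninstall -y cuda-python")
                print("then install:    pip install cuda-python")
            else:
                print(f"cuda-python {version} is newer than the affected range")
```

The script only reports what to do. It leaves running pip to the user so that nothing is uninstalled unexpectedly.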