CUDA Python 11.6.0 Release notes¶
Released on Januray 12, 2022
Highlights¶
Support CUDA Toolkit 11.6
Support Profiler APIs
Support Graphic APIs (EGL, GL, VDPAU)
Support changing default stream
Relaxed primitive interoperability
Default stream¶
Changing default stream to Per-Thread-Default-Stream (PTDS) is done through environment variable before execution:
export CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=1
When set to 1, the default stream is the per-thread default stream. When set to 0, the default stream is the legacy default stream. This defaults to 0, for the legacy default stream. See Stream Synchronization Behavior for an explanation of the legacy and per-thread default streams.
Primitive interoperability¶
APIs accepting classes that wrap a primitive value are now interoperable with the underlining value.
Example 1: Structure member handles interoperability.
>>> waitParams = cuda.CUstreamMemOpWaitValueParams_st()
>>> waitParams.value64 = 1
>>> waitParams.value64
<cuuint64_t 1>
>>> waitParams.value64 = cuda.cuuint64_t(2)
>>> waitParams.value64
<cuuint64_t 2>
Example 2: Function signature handles interoperability.
>>> cudart.cudaStreamQuery(cudart.cudaStreamNonBlocking)
(<cudaError_t.cudaSuccess: 0>,)
>>> cudart.cudaStreamQuery(cudart.cudaStream_t(cudart.cudaStreamNonBlocking))
(<cudaError_t.cudaSuccess: 0>,)
Limitations¶
CUDA Functions Not Supported in this Release¶
Symbol APIs
cudaGraphExecMemcpyNodeSetParamsFromSymbol
cudaGraphExecMemcpyNodeSetParamsToSymbol
cudaGraphAddMemcpyNodeToSymbol
cudaGraphAddMemcpyNodeFromSymbol
cudaGraphMemcpyNodeSetParamsToSymbol
cudaGraphMemcpyNodeSetParamsFromSymbol
cudaMemcpyToSymbol
cudaMemcpyFromSymbol
cudaMemcpyToSymbolAsync
cudaMemcpyFromSymbolAsync
cudaGetSymbolAddress
cudaGetSymbolSize
cudaGetFuncBySymbol
Launch Options
cudaLaunchKernel
cudaLaunchCooperativeKernel
cudaLaunchCooperativeKernelMultiDevice
cudaSetValidDevices
cudaVDPAUSetVDPAUDevice
Note
Deprecated APIs are removed from tracking