cuda.core 0.2.0 Release Notes
Released on March 17, 2025
Highlights
- Add ProgramOptions to facilitate the passing of runtime compile options to Program.
Breaking Changes
- The stream attribute is removed from LaunchConfig. Instead, the Stream object should now be passed directly to launch() as an argument.
- The signature for launch() is changed by swapping positional arguments; the new signature is (stream, config, kernel, *kernel_args).
- Change __cuda_stream__ from attribute to method.
- The Program.compile() method no longer accepts the options argument. Instead, you can optionally pass an instance of ProgramOptions to the constructor of Program.
- Device.properties() now provides attribute getters instead of a dictionary interface.
- The .handle attribute of various cuda.core objects now returns the underlying Python object instead of a (type-erased) Python integer.
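
As a rough migration sketch for the launch() and ProgramOptions changes listed above, the function below compiles and launches an empty kernel. The helper name run_noop_kernel, the NOOP_SRC source string, and the cuda.core.experimental import path are assumptions made for illustration; the function needs a CUDA-capable GPU and is not meant as the canonical usage.

```python
# Kernel source with C linkage so the kernel can be looked up by its plain name.
NOOP_SRC = 'extern "C" __global__ void noop() {}'


def run_noop_kernel():
    """Compile and launch an empty kernel using the 0.2.0 calling conventions."""
    # Imported lazily so this module can be loaded on machines without CUDA.
    # The cuda.core.experimental namespace is an assumption about the install.
    from cuda.core.experimental import (
        Device,
        LaunchConfig,
        Program,
        ProgramOptions,
        launch,
    )

    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    # compute_capability is read here as a (major, minor) pair.
    arch = "".join(str(part) for part in dev.compute_capability)

    # Compile options now go to the Program constructor, not to compile().
    options = ProgramOptions(arch=f"sm_{arch}")
    prog = Program(NOOP_SRC, code_type="c++", options=options)
    kernel = prog.compile("cubin").get_kernel("noop")

    # LaunchConfig no longer carries a stream; the stream is the first
    # positional argument to launch(), followed by the config and the kernel.
    config = LaunchConfig(grid=1, block=1)
    launch(stream, config, kernel)
    stream.sync()
```

Code written against the previous release would have set the stream on LaunchConfig and passed options to compile(); both of those spellings are the ones this release removes.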
New features
- Expose ObjectCode as a public API, which allows loading cubins from memory or disk. For loading other kinds of code types, please continue using Program.
- A C++ helper function get_cuda_native_handle() is provided in the new include/utility.cuh header to retrieve the underlying CUDA C objects (e.g. CUstream) from a Python object returned by the .handle attribute (e.g. Stream.handle).
- For objects such as Program and Linker that could dispatch to different backends, a new .backend attribute is provided to query this information.
- Support CUDA Event timing. (#481, #498, #508)
- An Event may now be created without recording it to a Stream using the Device.create_event() method.
- Program now supports the additional PTX code type. (#317)
- Linker.link() exceptions now include the original error log. (#423)
- In a systematic sweep through the cuda.core implementations, many exception messages were made more consistent and informative. (#458)
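
To illustrate the Event timing support mentioned above, here is a minimal sketch that records two timing-enabled events on a stream and subtracts them. The helper name time_host_delay is hypothetical, and the EventOptions(enable_timing=True) option, the options keyword of Stream.record(), and the subtraction of events to obtain milliseconds are assumptions about the API that should be checked against the API reference. A CUDA-capable GPU is required.

```python
import time


def time_host_delay(delay_seconds=0.1):
    """Return the milliseconds elapsed between two events recorded on a stream."""
    # Imported lazily so this module can be loaded on machines without CUDA.
    from cuda.core.experimental import Device, EventOptions

    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    # Timing has to be requested when the events are created.
    options = EventOptions(enable_timing=True)

    start = stream.record(options=options)
    time.sleep(delay_seconds)  # stand-in for enqueuing real GPU work
    end = stream.record(options=options)
    end.sync()  # block until the second event has completed

    # Subtracting two timing-enabled events is assumed to yield milliseconds.
    return end - start
```

In real use, the sleep would be replaced by kernel launches on the same stream, so that the difference reflects GPU execution time rather than a host-side delay.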
New examples
- jit_lto_fractal.py: Demonstrates just-in-time link-time optimization for fractal generation. (Device, LaunchConfig, Linker, LinkerOptions, Program, ProgramOptions) (#475)
- simple_multi_gpu_example.py: Example of using multiple GPUs. (Device, Program, LaunchConfig) (#304)
- show_device_properties.py: Displays detailed device properties. (Device) (#474)
Minor fixes and enhancements
- A dangling pointer problem in _linker.py was fixed. (#516)
- Add @functools.lru_cache decorator for get_binding_version(). (#512)
- Selected .decode() calls were changed to .decode("utf-8", errors="backslashreplace") to ensure that decoding error messages does not abort the process. (#510)
- The performance of Device.compute_capability() was improved. (#459)
- The Program constructor now issues a warning when falling back to cuLink(). (#315)
- To avoid deprecation warnings, the cuda.bindings imports in the cuda.core implementations were cleaned up. (#404)
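
The effect of the .decode() change above can be seen in plain Python, independent of cuda.core. The sketch below uses a hypothetical helper, decode_log, and a made-up byte string; it only shows how the backslashreplace error handler behaves on invalid UTF-8.

```python
def decode_log(raw: bytes) -> str:
    """Decode a byte string without raising on invalid UTF-8 sequences."""
    # Invalid bytes are rendered as backslash escapes instead of raising
    # UnicodeDecodeError, so the rest of the message stays readable.
    return raw.decode("utf-8", errors="backslashreplace")


# 0xff can never appear in valid UTF-8, so a strict decode would raise here.
raw_log = b"error: bad token \xff near line 3"
print(decode_log(raw_log))  # prints: error: bad token \xff near line 3
```

With the default strict handler, the same input would raise UnicodeDecodeError while the error message is being formatted, which hides the original problem.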
Test fixes
- Clean up device initialization in some tests. (#507)