cuda.core 0.2.0 Release Notes
Released on March 17, 2025
Highlights
- Add ProgramOptions to facilitate the passing of runtime compile options to Program.
Breaking Changes
- The stream attribute is removed from LaunchConfig. Instead, the Stream object should now be passed directly to launch() as an argument.
- The signature of launch() is changed by swapping positional arguments. The new signature is (stream, config, kernel, *kernel_args).
- Change __cuda_stream__ from attribute to method.
- The Program.compile() method no longer accepts the options argument. Instead, you can optionally pass an instance of ProgramOptions to the constructor of Program.
- Device.properties() now provides attribute getters instead of a dictionary interface.
- The .handle attribute of various cuda.core objects now returns the underlying Python object instead of a (type-erased) Python integer.
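The attribute-to-method change for __cuda_stream__ can be sketched in plain Python, without a GPU. This is an illustration only: LegacyStreamWrapper, StreamWrapper, and read_stream_info are hypothetical names, the handle value is a placeholder integer rather than a real stream, and the (version, handle) tuple layout is an assumption about the protocol that should be checked against the cuda.core documentation.

```python
class LegacyStreamWrapper:
    """Hypothetical pre-0.2.0 style: __cuda_stream__ is a plain attribute."""

    def __init__(self, handle: int):
        # Assumed layout: (protocol version, stream handle as an integer).
        self.__cuda_stream__ = (0, handle)


class StreamWrapper:
    """Hypothetical 0.2.0 style: __cuda_stream__ is a method."""

    def __init__(self, handle: int):
        self._handle = handle

    def __cuda_stream__(self):
        # Same assumed layout, but computed when called.
        return (0, self._handle)


def read_stream_info(obj):
    """Hypothetical consumer that tolerates both styles during a migration."""
    info = obj.__cuda_stream__
    if callable(info):
        info = info()
    return info


# 0x1234 is a placeholder value, not a real CUDA stream handle.
legacy_info = read_stream_info(LegacyStreamWrapper(0x1234))
method_info = read_stream_info(StreamWrapper(0x1234))
print(legacy_info == method_info)
```

A consumer that only needs to support 0.2.0 and later can drop the callable() check and always call the method.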
New features
- Expose ObjectCode as a public API, which allows loading cubins from memory or disk. For loading other kinds of code types, please continue using Program.
- A C++ helper function get_cuda_native_handle() is provided in the new include/utility.cuh header to retrieve the underlying CUDA C objects (ex: CUstream) from a Python object returned by the .handle attribute (ex: Stream.handle).
- For objects such as Program and Linker that could dispatch to different backends, a new .backend attribute is provided to query this information.
- Support CUDA Event timing. (#481, #498, #508)
- An Event may now be created without recording it to a Stream using the Device.create_event() method.
- Program now supports the additional PTX code type. (#317)
- Linker.link() exceptions now include the original error log. (#423)
- In a systematic sweep through the cuda.core implementations, many exception messages were made more consistent and informative. (#458)
New examples
- jit_lto_fractal.py: Demonstrates just-in-time link-time optimization for fractal generation. (Device, LaunchConfig, Linker, LinkerOptions, Program, ProgramOptions) (#475)
- simple_multi_gpu_example.py: Example of using multiple GPUs. (Device, Program, LaunchConfig) (#304)
- show_device_properties.py: Displays detailed device properties. (Device) (#474)
Minor fixes and enhancements
- A dangling pointer problem in _linker.py was fixed. (#516)
- Add @functools.lru_cache decorator for get_binding_version(). (#512)
- Selected .decode() were changed to .decode("utf-8", errors="backslashreplace") to ensure that decoding error messages does not abort the process. (#510)
- The performance of Device.compute_capability() was improved. (#459)
- The Program constructor now issues a warning when falling back to cuLink(). (#315)
- To avoid deprecation warnings, the cuda.bindings imports in the cuda.core implementations were cleaned up. (#404)
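Two of the items above rely only on the Python standard library, so their effect can be sketched without a GPU. The byte string is a made-up error log, and get_version_stub is a hypothetical stand-in, not the actual get_binding_version() implementation.

```python
import functools

# Made-up error log containing a byte (0xff) that is not valid UTF-8.
raw_log = b"error: unexpected token \xff near line 3"

# A strict decode raises, which would hide the message being reported.
try:
    raw_log.decode()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="backslashreplace", the invalid byte is kept as an escape
# sequence, so decoding never raises on malformed input.
safe_log = raw_log.decode("utf-8", errors="backslashreplace")
print(safe_log)

call_count = 0


@functools.lru_cache
def get_version_stub():
    """Hypothetical stand-in for a version lookup whose result never changes."""
    global call_count
    call_count += 1
    return (12, 8)  # placeholder version tuple


# The second call is served from the cache, so the body runs only once.
get_version_stub()
get_version_stub()
print(call_count)
```

Caching suits a lookup like this because the result cannot change during the lifetime of the process.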
Test fixes
Clean up device initialization in some tests. (#507)