# `cuda.core` v0.1.1 Release notes

Released on Dec 20, 2024

## Highlights

- Add `StridedMemoryView` and `@args_viewable_as_strided_memory`, which provide a concrete implementation of DLPack and CUDA Array Interface support (sketched after these notes).
- Add `Linker`, which can link one or more `ObjectCode` instances generated by `Program` (see the linking sketch below). Under the hood, it uses either the nvJitLink or driver (`cuLink*`) APIs depending on the CUDA version detected in the current environment.
- Support `pip install cuda-core`. Please see the Installation Guide for further details.

## New features

- Add a `cuda.core.experimental.system` module for querying system- or process-wide information (sketched below).
- Add `LaunchConfig.cluster` to support thread block clusters on Hopper GPUs (sketched below).

## Enhancements

- The internal handle held by `ObjectCode` is now lazily initialized upon first touch.
- Support TCC devices with a default synchronous memory resource to avoid the use of memory pools.
- Ensure `"ltoir"` is a valid code type for `ObjectCode`.
- Document the `__cuda_stream__` protocol.
- Improve test coverage & documentation cross-references.
- Enforce code formatting.

## Bug fixes

- Eliminate potential class destruction issues.
- Fix a circular import when handling a foreign CUDA stream.

## Limitations

- All APIs are currently *experimental* and subject to change without deprecation notice. Please kindly share your feedback with us so that we can make `cuda.core` better!
- Using `cuda.core` with NVRTC or nvJitLink installed from PyPI via `pip install` is currently not supported. This will be fixed in a future release.
- Some `LinkerOptions` are only available when using a modern version of CUDA. When using CUDA <12, the backend is the driver's `cuLink*` API, which supports only a subset of the options that nvJitLink does. Further, some options aren't available on CUDA versions <12.6.
- Using `cuda.core` with Python 3.13 currently requires building `cuda-python` from source prior to `pip install`. This extra step will be fixed soon.
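As a rough sketch of the new view APIs: a function decorated with `@args_viewable_as_strided_memory` can accept a buffer from any DLPack- or CUDA Array Interface-exporting library and materialize it as a `StridedMemoryView` without copying. CuPy is used here purely as an example producer, and the view attribute names are assumed to match the 0.1.1 documentation; please verify them before relying on this.

```python
import cupy as cp  # any DLPack / CUDA Array Interface exporter would do

from cuda.core.experimental.utils import args_viewable_as_strided_memory


@args_viewable_as_strided_memory((0,))  # positional argument 0 may be viewed
def inspect(arr, stream_ptr):
    # arr is wrapped so that .view() produces a StridedMemoryView over the
    # underlying buffer, regardless of which library allocated it.
    view = arr.view(stream_ptr)
    print(view.ptr, view.shape, view.dtype, view.device_id, view.is_device_accessible)


a = cp.arange(12, dtype=cp.float32).reshape(3, 4)
inspect(a, -1)  # -1: the caller establishes no stream ordering (DLPack convention)
```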
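A minimal linking sketch, assuming `Program(code, code_type)`, `Program.compile(target, options=...)`, `LinkerOptions(arch=...)`, and `Linker.link(target)` behave as documented for 0.1.1; the NVRTC option spellings and the `sm_80`/`compute_80` targets are placeholders to adjust for your environment.

```python
from cuda.core.experimental import Device, Linker, LinkerOptions, Program

Device().set_current()  # a CUDA context is needed to load the linked kernel

code_a = "__device__ int add(int a, int b) { return a + b; }"
code_b = """
extern __device__ int add(int a, int b);
extern "C" __global__ void kernel(int *out) { *out = add(1, 2); }
"""

# Compile each translation unit to relocatable PTX.
opts = ("-rdc=true", "-arch=compute_80")
obj_a = Program(code_a, code_type="c++").compile("ptx", options=opts)
obj_b = Program(code_b, code_type="c++").compile("ptx", options=opts)

# Link both ObjectCode instances into a single cubin and fetch the kernel.
linker = Linker(obj_a, obj_b, options=LinkerOptions(arch="sm_80"))
linked = linker.link("cubin")
kernel = linked.get_kernel("kernel")
```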
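A short sketch of the new `system` module, assuming it exposes the `driver_version`, `num_devices`, and `devices` attributes described in its documentation.

```python
from cuda.core.experimental import system

print(system.driver_version)  # driver version, e.g. a (major, minor) tuple
print(system.num_devices)     # number of devices visible to this process
for device in system.devices:
    print(device.device_id)   # each entry is a cuda.core Device
```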
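And a hedged sketch of `LaunchConfig.cluster`: how `grid` is interpreted when `cluster` is set is an assumption here, so check the `LaunchConfig` documentation before relying on it. The resulting config would be passed to `launch()` together with a compiled kernel on a cluster-capable (sm_90) device.

```python
from cuda.core.experimental import LaunchConfig

# 2 blocks per cluster, 256 threads per block; treating `grid` as the number
# of clusters (rather than blocks) is an assumption about the 0.1.1 semantics.
config = LaunchConfig(grid=4, cluster=2, block=256)
```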