cuda.core 0.6.0 Release Notes

New features

  • Added public access to the default CUDA streams via the module-level constants LEGACY_DEFAULT_STREAM and PER_THREAD_DEFAULT_STREAM.

    Users can now access default streams directly from the cuda.core namespace:

    from cuda.core import LEGACY_DEFAULT_STREAM, PER_THREAD_DEFAULT_STREAM
    
    # Use legacy default stream (synchronizes with all blocking streams)
    LEGACY_DEFAULT_STREAM.sync()
    
    # Use per-thread default stream (thread-local; no implicit sync with other streams except the legacy default stream)
    PER_THREAD_DEFAULT_STREAM.sync()
    

    The legacy default stream implicitly synchronizes with all blocking streams in the same CUDA context, which enforces strict ordering but can limit concurrency. The per-thread default stream is local to the calling thread and does not implicitly synchronize with other streams, with one exception: it still synchronizes with the legacy default stream if both are used in the same program. This makes it the better fit for concurrent execution in multi-threaded applications.

    This replaces the previous undocumented workaround of using Stream.from_handle(0) to access the legacy default stream.
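As a rough illustration of the difference described above, the sketch below has several host threads each synchronize the per-thread default stream. The helper names `default_stream` and `sync_from_threads` are hypothetical and not part of cuda.core. The sketch assumes that `Device().set_current()` should be called in every worker thread before touching a stream, and it imports cuda.core lazily so that the module can at least be imported on a machine without CUDA.

```python
import threading


def default_stream(per_thread: bool = False):
    """Return one of the two default-stream constants (hypothetical helper)."""
    # Lazy import: only needed once a stream is actually requested.
    from cuda.core import LEGACY_DEFAULT_STREAM, PER_THREAD_DEFAULT_STREAM

    return PER_THREAD_DEFAULT_STREAM if per_thread else LEGACY_DEFAULT_STREAM


def sync_from_threads(num_threads: int = 2) -> int:
    """Sync the per-thread default stream from several host threads.

    Returns the number of worker threads that completed their sync.
    """
    from cuda.core import Device

    finished = []

    def worker():
        # Assumption: each host thread sets its own current device first.
        Device().set_current()
        # The same constant refers to the calling thread's own default stream.
        default_stream(per_thread=True).sync()
        finished.append(threading.get_ident())

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(finished)
```

Code that previously called Stream.from_handle(0) to reach the legacy default stream would, under this sketch, call `default_stream()` or simply use LEGACY_DEFAULT_STREAM directly.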

Fixes and enhancements

None.