cuda.core.experimental.utils.StridedMemoryView#

class cuda.core.experimental.utils.StridedMemoryView(obj=None, stream_ptr=None)#

A class holding metadata of a strided dense array/tensor.

A StridedMemoryView instance can be created in three ways:

  1. Using the args_viewable_as_strided_memory decorator (recommended; see the sketch after this list)

  2. Explicit construction relying on DLPack or CUDA Array Interface, see below.

  3. From Buffer and a StridedLayout (see from_buffer() classmethod)
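
For the decorator path (1), a minimal sketch is shown below. It follows the pattern used in the cuda.core examples: the wrapped positional argument arrives inside the function as a lightweight proxy whose view(stream_ptr) method materializes a StridedMemoryView. Treat the helper name and the exact proxy behavior here as illustrative assumptions rather than a verbatim recipe:

    import numpy as np

    from cuda.core.experimental.utils import (
        StridedMemoryView,
        args_viewable_as_strided_memory,
    )


    @args_viewable_as_strided_memory((0,))  # make positional argument 0 viewable
    def describe(arr, stream=None):
        # -1 opts out of stream ordering, which is fine for host-resident data;
        # pass a real stream pointer (e.g. int(stream.handle)) for GPU data.
        view = arr.view(-1 if stream is None else int(stream.handle))
        assert isinstance(view, StridedMemoryView)
        return view.shape, view.dtype, view.device_id


    print(describe(np.zeros((4, 2), dtype=np.int8)))  # e.g. ((4, 2), dtype('int8'), -1)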

StridedMemoryView(obj, stream_ptr) can be used to create a view from objects supporting either DLPack (up to v1.0) or CUDA Array Interface (CAI) v3. When wrapping an arbitrary object, the DLPack protocol is tried first, then the CAI protocol. A BufferError is raised if neither is supported.

Since either way would take a consumer stream, for DLPack it is passed to obj.__dlpack__() as-is (except for None, see below); for CAI, stream ordering will be established between the consumer stream and the producer stream (from obj.__cuda_array_interface__["stream"]), as if cudaStreamWaitEvent were called by this method.

To opt out of the stream ordering operation in either DLPack or CAI, pass stream_ptr=-1. Note that this deviates (on purpose) from the semantics of obj.__dlpack__(stream=None, ...), since cuda.core does not encourage using the (legacy) default/null stream, but it is consistent with the CAI’s semantics. For DLPack, stream=-1 (rather than None) will be passed to obj.__dlpack__() internally.

Parameters:
  • obj (Any) – Any object that supports either DLPack (up to v1.0) or CUDA Array Interface (v3).

  • stream_ptr (int) – The pointer address (as Python int) to the consumer stream. Stream ordering will be properly established unless -1 is passed.
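
As a minimal sketch of explicit construction, the snippet below wraps a NumPy array via DLPack (a CPU tensor, so no stream ordering is needed and stream_ptr=-1 is passed) and inspects the attributes documented below. A NumPy version that implements the DLPack protocol is assumed:

    import numpy as np

    from cuda.core.experimental.utils import StridedMemoryView

    arr = np.ones((2, 3), dtype=np.float32)
    view = StridedMemoryView(arr, stream_ptr=-1)

    print(view.ptr == arr.ctypes.data)  # True: the view aliases arr's buffer
    print(view.device_id)               # -1 for a CPU tensor
    print(view.is_device_accessible)    # False
    print(view.readonly)                # False
    print(view.shape, view.dtype)       # (2, 3) float32
    print(view.strides)                 # element counts, not bytes; may be None
                                        # if the exporter reports a dense tensor
    print(view.exporting_obj is arr)    # True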

ptr#

Pointer to the tensor buffer (as a Python int).

Type:

int

device_id#

The ID of the device where the tensor is located. It is -1 for CPU tensors (meaning those only accessible from the host).

Type:

int

is_device_accessible#

Whether the tensor data can be accessed on the GPU.

Type:

bool

readonly#

Whether the tensor data is read-only (i.e., cannot be modified in place).

Type:

bool

exporting_obj#

A reference to the original tensor object that is being viewed. If the view is created with from_buffer(), it will be the Buffer instance passed to the method.

Type:

Any

dtype#

Data type of the tensor.

Type:

Optional[numpy.dtype]

layout#

The layout of the tensor. For StridedMemoryView created from DLPack or CAI, the layout is inferred from the tensor object’s metadata.

Type:

StridedLayout

shape#

Shape of the tensor.

Type:

tuple[int]

strides#

Strides of the tensor (in counts, not bytes).

Type:

Optional[tuple[int]]

Methods

__init__(*args, **kwargs)#
copy_from(
self,
other: StridedMemoryView,
stream: Stream,
allocator=None,
blocking: bool | None = None,
)#

Copies the data from the other view into this view.

The copy can be performed between the following memory spaces: host-to-device, device-to-host, and device-to-device (on the same device).

Parameters:
  • other (StridedMemoryView) – The view to copy data from.

  • stream (Stream | None, optional) – The stream to schedule the copy on.

  • allocator (MemoryResource | None, optional) – If temporary buffers are needed, the specified memory resource will be used to allocate the memory. If not specified, default resources will be used.

  • blocking (bool | None, optional) –

    Whether the call should block until the copy is complete.
    • True: the stream is synchronized with the host at the end of the call, blocking until the copy is complete.

    • False: if possible, the call returns immediately once the copy is scheduled. However, in some cases of host-to-device or device-to-host copies, the call may still synchronize with the host if necessary.

    • None (default):
      • for device-to-device, it defaults to False (non-blocking),

      • for host-to-device or device-to-host, it defaults to True (blocking).
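
The sketch below round-trips a host array through the GPU with copy_from() and copy_to(). CuPy is assumed to be installed, purely as a convenient producer of device memory via DLPack/CAI, and Stream.handle is assumed to convert to the integer stream pointer that stream_ptr expects:

    import cupy as cp
    import numpy as np

    from cuda.core.experimental import Device
    from cuda.core.experimental.utils import StridedMemoryView

    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    src = np.arange(8, dtype=np.float64)        # host source
    dst = cp.empty(src.shape, dtype=src.dtype)  # device staging array
    out = np.zeros_like(src)                    # host destination

    src_view = StridedMemoryView(src, stream_ptr=-1)
    out_view = StridedMemoryView(out, stream_ptr=-1)
    dst_view = StridedMemoryView(dst, stream_ptr=int(stream.handle))

    # Host-to-device: blocking defaults to True, so the call returns only after
    # the copy has completed on `stream`.
    dst_view.copy_from(src_view, stream=stream)

    # Device-to-host: blocking again defaults to True, so `out` is ready afterwards.
    dst_view.copy_to(out_view, stream=stream)

    assert (out == src).all()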

copy_to(
self,
other: StridedMemoryView,
stream: Stream | None = None,
allocator=None,
blocking: bool | None = None,
)#

Copies the data from this view into the other view.

For details, see copy_from().

classmethod from_buffer(
cls,
buffer: Buffer,
layout: StridedLayout,
dtype: numpy.dtype | None = None,
is_readonly: bool = False,
) → StridedMemoryView#

Creates a StridedMemoryView instance from a Buffer and a StridedLayout. The Buffer can be either an allocation coming from a MemoryResource or an external allocation wrapped in a Buffer object with Buffer.from_handle(ptr, size, owner=...).

Hint

When allocating the memory for a given layout, the required allocation size can be obtained with the StridedLayout.required_size_in_bytes() method. It is best to use the StridedLayout.to_dense() method first to make sure the layout is contiguous, to avoid overallocating memory for layouts with gaps.

Caution

When creating a StridedMemoryView from a Buffer, no synchronization is performed. It is the user’s responsibility to ensure the data in buffer is properly synchronized when consuming the view.

Parameters:
  • buffer (Buffer) – The buffer to create the view from.

  • layout (StridedLayout) – The layout describing the shape, strides and itemsize of the elements in the buffer.

  • dtype (numpy.dtype, optional) – Optional dtype. If specified, the dtype’s itemsize must match the layout’s itemsize. To view the buffer with a different itemsize, please use StridedLayout.repacked() first to transform the layout to the desired itemsize.

  • is_readonly (bool, optional) – Whether to mark the view as readonly.
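
As a minimal sketch, the snippet below derives a dense layout from an existing host view, allocates a device Buffer of the required size, and wraps it with from_buffer(). Device.allocate() returning a Buffer is assumed here; any MemoryResource allocation or a Buffer.from_handle() wrapper would work the same way:

    import numpy as np

    from cuda.core.experimental import Device
    from cuda.core.experimental.utils import StridedMemoryView

    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    host = np.arange(12, dtype=np.int32).reshape(3, 4)
    host_view = StridedMemoryView(host, stream_ptr=-1)

    # Reuse the host view's layout; to_dense() yields a contiguous layout, so
    # required_size_in_bytes() does not overallocate for layouts with gaps.
    layout = host_view.layout.to_dense()
    buf = dev.allocate(layout.required_size_in_bytes(), stream=stream)

    dev_view = StridedMemoryView.from_buffer(buf, layout, dtype=host_view.dtype)

    # from_buffer() performs no synchronization; establish stream order explicitly,
    # here by filling the buffer with a stream-ordered copy.
    dev_view.copy_from(host_view, stream=stream)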

view(
self,
layout: StridedLayout | None = None,
dtype: numpy.dtype | None = None,
) → StridedMemoryView#

Creates a new view with adjusted layout and dtype. Same as calling from_buffer() with the current buffer.
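
As a minimal sketch, the snippet below reinterprets a Buffer-backed view under another dtype with the same itemsize; no data is copied. The scaffolding mirrors the from_buffer() sketch above, and passing layout=None is assumed to keep the current layout:

    import numpy as np

    from cuda.core.experimental import Device
    from cuda.core.experimental.utils import StridedMemoryView

    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    host_view = StridedMemoryView(np.zeros(16, dtype=np.int32), stream_ptr=-1)
    layout = host_view.layout.to_dense()
    buf = dev.allocate(layout.required_size_in_bytes(), stream=stream)

    v_i32 = StridedMemoryView.from_buffer(buf, layout, dtype=np.dtype(np.int32))
    v_u32 = v_i32.view(dtype=np.dtype(np.uint32))  # same 4-byte itemsize

    assert v_u32.ptr == v_i32.ptr       # same memory, different element interpretation
    assert v_u32.shape == v_i32.shape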