cuda.core.experimental.utils.StridedLayout#
- class cuda.core.experimental.utils.StridedLayout(tuple shape: tuple[int], tuple strides: tuple[int] | None, int itemsize: int, bool divide_strides: bool = False)#
A class describing the layout of a multi-dimensional tensor with a shape, strides and itemsize.
- Parameters:
shape (tuple) – A tuple of non-negative integers.
strides (tuple, optional) – If provided, must be a tuple of integers of the same length as shape. Otherwise, the strides are assumed to be implicitly C-contiguous and the resulting layout's strides will be None.
itemsize (int) – The number of bytes per single element (dtype size). Must be a power of two.
divide_strides (bool, optional) – If True, the provided strides will be divided by the itemsize.
See also
dense().
- slice_offset#
The offset (as a number of elements, not bytes) of the element at index (0,) * ndim. See also slice_offset_in_bytes.
- Type: int
- is_contiguous_any#
True iff the layout is contiguous in some axis order, i.e. there exists a permutation of axes such that the layout is C-contiguous.
In a contiguous layout, the strides are non-negative and the mapping of elements to the memory offset range [min_offset, max_offset] is 1-to-1.
```python
# dense defaults to C-contiguous
layout = StridedLayout.dense((5, 3, 7), 1)
assert layout.is_contiguous_c and not layout.is_contiguous_f
assert layout.is_contiguous_any

# reversing the order of axes gives F-contiguous
permuted = layout.permuted((2, 1, 0))
assert not permuted.is_contiguous_c and permuted.is_contiguous_f
assert permuted.is_contiguous_any

# neither C- nor F-order but still contiguous
permuted = layout.permuted((2, 0, 1))
assert not permuted.is_contiguous_c and not permuted.is_contiguous_f
assert permuted.is_contiguous_any

# slicing the right-most extent creates a gap in the
# offset_bounds range that is not reachable with any
# element in the sliced layout
sliced = layout[:, :, :-1]
assert not sliced.is_contiguous_c and not sliced.is_contiguous_f
assert not sliced.is_contiguous_any
```
- Type: bool
- is_contiguous_c#
True iff the layout is contiguous in C-order, i.e. the rightmost stride is 1 and each subsequent stride to the left is the product of the extent and the stride to the right.
```python
layout = StridedLayout.dense((2, 5, 3), 1, "C")
assert layout == StridedLayout((2, 5, 3), (15, 3, 1), 1)
assert layout.is_contiguous_c
```
See also
is_contiguous_any.
- Type: bool
- is_contiguous_f#
True iff the layout is contiguous in F-order, i.e. the leftmost stride is 1 and each subsequent stride to the right is the product of the stride and extent to the left.
```python
layout = StridedLayout.dense((2, 5, 3), 1, "F")
assert layout == StridedLayout((2, 5, 3), (1, 2, 10), 1)
assert layout.is_contiguous_f
```
See also
is_contiguous_any.
- Type: bool
- is_dense#
A dense layout is contiguous (
is_contiguous_anyis True) and has no slice offset (slice_offset_in_bytesis 0).In a dense layout, elements are mapped 1-to-1 to the
[0, volume - 1]memory offset range.- Type:
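A minimal sketch of the distinction; the slicing behavior is assumed from the sliced() documentation below:
```python
layout = StridedLayout.dense((5, 3), 1)
assert layout.is_dense
# slicing off the first row keeps the layout contiguous but introduces
# a non-zero slice offset, so the result is no longer dense
assert layout[1:, :].is_contiguous_any
assert not layout[1:, :].is_dense
```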
- is_unique#
If True, each element of a tensor with this layout is mapped to a unique memory offset.
All contiguous layouts are unique and so are layouts that can be created by permuting, slicing, flattening, squeezing, repacking, or reshaping a contiguous layout. Conversely, broadcast layouts (layouts with a 0 stride for some extent greater than 1) are not unique.
For layouts resulting from manual stride manipulations (such as with numpy.lib.stride_tricks), the check may inaccurately report False, as the exact uniqueness check may be expensive.
- Type: bool
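A short sketch, assuming broadcast_to() assigns stride 0 to added extents as documented below:
```python
layout = StridedLayout.dense((5, 3), 1)
assert layout.is_unique
# a broadcast layout maps many indices to the same offset
assert not layout.broadcast_to((4, 5, 3)).is_unique
```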
- max_offset#
See offset_bounds for details.
- Type: int
- min_offset#
See offset_bounds for details.
- Type: int
- offset_bounds#
The memory offset range [min_offset, max_offset] (in element counts, not bytes) that elements of a tensor with this layout are mapped to.
If the layout is empty (i.e. volume == 0), the returned tuple is (0, -1). Otherwise, min_offset <= max_offset and all elements of the tensor with this layout are mapped within the [min_offset, max_offset] range.
```python
# Possible implementation of offset_bounds
def offset_bounds(layout: StridedLayout):
    if layout.volume == 0:
        return 0, -1
    ndim = layout.ndim
    shape = layout.shape
    strides = layout.strides
    idx_min = [shape[i] - 1 if strides[i] < 0 else 0 for i in range(ndim)]
    idx_max = [shape[i] - 1 if strides[i] > 0 else 0 for i in range(ndim)]
    min_offset = sum(strides[i] * idx_min[i] for i in range(ndim)) + layout.slice_offset
    max_offset = sum(strides[i] * idx_max[i] for i in range(ndim)) + layout.slice_offset
    return min_offset, max_offset
```
- slice_offset_in_bytes#
The memory offset (as a number of bytes) of the element at index (0,) * ndim. Equal to itemsize * slice_offset.
Note
The only way for the index (0,) * ndim to be mapped to a non-zero offset is slicing with the sliced() method (or the [] operator).
- Type: int
- stride_order#
A permutation of tuple(range(ndim)) describing the relative order of the strides.
```python
# C-contiguous layout
assert StridedLayout.dense((5, 3, 7), 1).stride_order == (0, 1, 2)
# F-contiguous layout
assert StridedLayout.dense((5, 3, 7), 1, stride_order="F").stride_order == (2, 1, 0)
# Permuted layout
assert StridedLayout.dense((5, 3, 7), 1, stride_order=(2, 0, 1)).stride_order == (2, 0, 1)
```
- strides#
Strides of the tensor (in element counts, not bytes). If the StridedLayout was created with strides=None, the returned value is None and the layout is implicitly C-contiguous.
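A small sketch of the implicit-strides behavior described above, assuming the constructor accepts strides=None as documented:
```python
layout = StridedLayout((5, 3), None, 1)
assert layout.strides is None
# implicit strides are treated as C-contiguous
assert layout.is_contiguous_c
```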
Methods
- __init__(*args, **kwargs)#
- broadcast_to(self: StridedLayout, tuple shape: tuple[int]) → StridedLayout#
Returns a layout with the new shape, if the old shape can be broadcast to the new one.
- The shapes are compatible if:
the new shape has the same or a greater number of dimensions;
starting from the right, each extent in the old shape must be 1 or equal to the corresponding extent in the new shape.
Strides of the added or modified extents are set to 0, the remaining ones are unchanged. If the shapes are not compatible, a ValueError is raised.
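A minimal sketch of these rules; the concrete strides below assume a dense C-order source layout:
```python
layout = StridedLayout.dense((3, 1), 4)   # C-order strides: (1, 1)
broadcast = layout.broadcast_to((2, 3, 5))
assert broadcast.shape == (2, 3, 5)
# the prepended extent and the extent expanded from 1 get stride 0;
# the stride of the unchanged extent is kept
assert broadcast.strides == (0, 1, 0)
```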
- classmethod dense(cls, tuple shape: tuple[int], int itemsize: int, stride_order: str | tuple[int] = 'C') → StridedLayout#
Creates a new StridedLayout instance with dense strides.
- Parameters:
shape (tuple) – A tuple of non-negative integers.
itemsize (int) – The number of bytes per single element of the tensor.
stride_order (str or tuple, optional) –
- The order of the strides:
'C' (default) - the strides are computed in C-order (increasing from the right to the left)
'F' - the strides are computed in F-order (increasing from the left to the right)
A tuple - it must be a permutation of tuple(range(len(shape))). The last element of the tuple is the axis with stride 1.
See also
stride_order.
```python
assert StridedLayout.dense((5, 3, 7), 1, "C") == StridedLayout((5, 3, 7), (21, 7, 1), 1)
assert StridedLayout.dense((5, 3, 7), 1, "F") == StridedLayout((5, 3, 7), (1, 5, 15), 1)
assert StridedLayout.dense((5, 3, 7), 1, (2, 0, 1)) == StridedLayout((5, 3, 7), (3, 1, 15), 1)
```
- classmethod dense_like(
- cls,
- StridedLayout other: StridedLayout,
- stride_order: str | tuple[int] = 'K',
- ) → StridedLayout#
Creates a StridedLayout with the same shape and itemsize as the other layout, but with contiguous strides in the specified order and no slice offset.
See also
is_dense.
- Parameters:
other (StridedLayout) – The StridedLayout to copy the shape and itemsize from.
stride_order (str or tuple, optional) –
- The order of the strides:
'K' (default) - keeps the order of the strides as in the other layout.
'C' - the strides are computed in C-order (increasing from the right to the left)
'F' - the strides are computed in F-order (increasing from the left to the right)
A tuple - it must be a permutation of tuple(range(len(shape))). The last element of the tuple is the axis with stride 1.
See also
stride_order.
```python
layout = StridedLayout.dense((5, 3, 7), 1).permuted((2, 0, 1))
assert layout == StridedLayout((7, 5, 3), (1, 21, 7), 1)
# dense_like with the default "K" stride_order
# keeps the same order of strides as in the original layout
assert StridedLayout.dense_like(layout) == layout
# "C", "F" recompute the strides accordingly
assert StridedLayout.dense_like(layout, "C") == StridedLayout((7, 5, 3), (15, 3, 1), 1)
assert StridedLayout.dense_like(layout, "F") == StridedLayout((7, 5, 3), (1, 7, 35), 1)
```
- flattened(
- self: StridedLayout,
- int start_axis: int = 0,
- int end_axis: int = -1,
- int mask: int | None = None,
- ) → StridedLayout#
Merges consecutive extents into a single extent (equal to the product of merged extents) if the corresponding strides can be replaced with a single stride (assuming indices are iterated in C-order, i.e. the rightmost axis is incremented first).
```python
# the two extents can be merged into a single extent
# because layout.strides[0] == layout.strides[1] * layout.shape[1]
layout = StridedLayout((3, 2), (2, 1), 1)
assert layout.flattened() == StridedLayout((6,), (1,), 1)

# the two extents cannot be merged into a single extent
# because layout.strides[0] != layout.strides[1] * layout.shape[1]
layout = StridedLayout((3, 2), (1, 3), 1)
assert layout.flattened() == layout
```
If start_axis and end_axis are provided, only the axes in the inclusive range [start_axis, end_axis] are considered for flattening.
Alternatively, a mask specifying which axes to consider can be provided. A mask of mergeable extents can be obtained using the flattened_axis_mask() method. Masks for layouts with the same number of dimensions can be combined using the logical & (bitwise AND) operator.
```python
layout = StridedLayout.dense((4, 5, 3), 4)
layout2 = StridedLayout((4, 5, 3), (1, 12, 4), 4)
# Even though the two layouts have the same shape initially,
# their shapes differ after flattening.
assert layout.flattened() == StridedLayout((60,), (1,), 4)
assert layout2.flattened() == StridedLayout((4, 15), (1, 4), 4)
# With the mask, only extents that are mergeable in both layouts are flattened
# and the resulting shape is the same for both layouts.
mask = layout.flattened_axis_mask() & layout2.flattened_axis_mask()
assert layout.flattened(mask=mask) == StridedLayout((4, 15), (15, 1), 4)
assert layout2.flattened(mask=mask) == StridedLayout((4, 15), (1, 4), 4)
```
- flattened_axis_mask(self: StridedLayout) → axes_mask_t#
A mask describing which axes of this layout are mergeable using the flattened() method.
- max_compatible_itemsize(
- self: StridedLayout,
- int max_itemsize: int = 16,
- uintptr_t data_ptr: uintptr_t = 0,
- int axis: int = -1,
- ) → int#
Returns the maximum itemsize (but no greater than max_itemsize) that can be used with the repacked() method for the current layout.
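A short sketch based on the repacked() constraints listed below; the concrete return value is an assumption for this particular shape:
```python
layout = StridedLayout.dense((5, 4), 4)
# 4 contiguous elements of 4 bytes each can be viewed as one 16-byte element,
# which also matches the default max_itemsize cap of 16
assert layout.max_compatible_itemsize() == 16
```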
- permuted(self: StridedLayout, tuple axis_order: tuple[int]) → StridedLayout#
Returns a new layout where the shape and strides tuples are permuted according to the specified permutation of axes.
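A minimal sketch of the permutation semantics, following the convention used in the dense_like() example above:
```python
layout = StridedLayout((2, 5, 3), (15, 3, 1), 1)
# axis_order[i] names the old axis that becomes the new axis i
assert layout.permuted((2, 0, 1)) == StridedLayout((3, 2, 5), (1, 15, 3), 1)
```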
- repacked(
- self: StridedLayout,
- int itemsize: int,
- uintptr_t data_ptr: uintptr_t = 0,
- int axis: int = -1,
- bool keep_dim: bool = True,
- ) → StridedLayout#
Converts the layout to match the specified itemsize. If new_itemsize < itemsize, each element of the tensor is unpacked into multiple elements, i.e. the extent at axis increases by the factor itemsize // new_itemsize. If new_itemsize > itemsize, consecutive elements in the tensor are packed into a single element, i.e. the extent at axis decreases by the factor new_itemsize // itemsize. In either case, the volume * itemsize of the layout remains the same.
- The conversion is subject to the following constraints:
The old and new itemsizes must be powers of two.
The extent at axis must be a positive integer.
The stride at axis must be 1.
- Moreover, if new_itemsize > itemsize:
The extent at axis must be divisible by new_itemsize // itemsize.
All other strides must be divisible by new_itemsize // itemsize.
The slice_offset must be divisible by new_itemsize // itemsize.
If data_ptr is provided, it must be aligned to the new itemsize.
The maximum itemsize that satisfies all the constraints can be obtained using the max_compatible_itemsize() method.
If keep_dim is False and the extent at axis would be reduced to 1, it is omitted from the returned layout.
```python
# Repacking the layout with itemsize = 4 bytes as 2-, 8-, and 16-byte layouts.
layout = StridedLayout.dense((5, 4), 4)
assert layout.repacked(2) == StridedLayout.dense((5, 8), 2)
assert layout.repacked(8) == StridedLayout.dense((5, 2), 8)
assert layout.repacked(16) == StridedLayout.dense((5, 1), 16)
assert layout.repacked(16, keep_dim=False) == StridedLayout.dense((5,), 16)
```
```python
# Viewing a (5, 6) float32 array as a (5, 3) complex64 array.
a = numpy.ones((5, 6), dtype=numpy.float32)
float_view = StridedMemoryView(a, -1)
layout = float_view.layout
assert layout.shape == (5, 6)
assert layout.itemsize == 4
complex_view = float_view.view(layout.repacked(8), numpy.complex64)
assert complex_view.layout.shape == (5, 3)
assert complex_view.layout.itemsize == 8
b = numpy.from_dlpack(complex_view)
assert b.shape == (5, 3)
```
- required_size_in_bytes(self: StridedLayout) → int#
The memory allocation size (in bytes) needed so that all elements of a tensor with this layout can be mapped within the allocated memory range.
The function raises an error if min_offset < 0. Otherwise, the returned value is equal to (max_offset + 1) * itemsize.
Hint
For dense layouts, the function always succeeds and (max_offset + 1) * itemsize is equal to volume * itemsize.
```python
# Allocating memory on a device to copy a host tensor
def device_tensor_like(a: numpy.ndarray, device: ccx.Device) -> StridedMemoryView:
    a_view = StridedMemoryView(a, -1)
    # get the original layout of ``a`` and convert it to a dense layout
    # to avoid overallocating memory (e.g. if ``a`` was sliced)
    layout = a_view.layout.to_dense()
    # get the required size in bytes to fit the tensor
    required_size = layout.required_size_in_bytes()
    # allocate the memory on the device
    device.set_current()
    mem = device.allocate(required_size)
    # create a view of the newly allocated device memory
    b_view = StridedMemoryView.from_buffer(mem, layout, a_view.dtype)
    return b_view
```
- reshaped(self: StridedLayout, tuple shape: tuple[int]) → StridedLayout#
Returns a layout with the new shape, if the new shape is compatible with the current layout.
- The new shape is compatible if:
the new and old shapes have the same volume
the old strides can be split or flattened to match the new shape, assuming indices are iterated in C-order
A single extent in the shape tuple can be set to -1 to indicate it should be inferred from the old volume and the other extents.
```python
layout = StridedLayout.dense((5, 3, 4), 1)
assert layout.reshaped((20, 3)) == StridedLayout.dense((20, 3), 1)
assert layout.reshaped((4, -1)) == StridedLayout.dense((4, 15), 1)
assert layout.permuted((2, 0, 1)).reshaped((4, 15)) == StridedLayout((4, 15), (1, 4), 1)
# layout.permuted((2, 0, 1)).reshaped((20, 3)) -> error
```
- sliced(self: StridedLayout, slices) → StridedLayout#
Returns a sliced layout. The slices parameter can be a single integer, a single slice object, or a tuple of integers/slices.
Hint
For convenience, instead of calling this method directly, please rely on the __getitem__() operator (i.e. bracket syntax), e.g. layout[:, start:end:step].
Note
Slicing is purely a layout transformation and does not involve any data access.
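A minimal sketch of the layout-only transformation; the concrete values below are assumptions that follow from the stride and slice_offset definitions above:
```python
layout = StridedLayout.dense((4, 6), 1)   # strides (6, 1)
sliced = layout[1:3, ::2]
assert sliced.shape == (2, 3)
assert sliced.strides == (6, 2)
# the element at index (0, 0) of the slice is element (1, 0) of the original
assert sliced.slice_offset == 6
```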
- squeezed(self: StridedLayout) → StridedLayout#
Returns a new layout where all the singleton dimensions (extents equal to 1) are removed. Additionally, if the layout volume is 0, the returned layout will be reduced to a 1-dim layout with shape (0,) and strides (0,).
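A short sketch of both behaviors described above; the strides kept for the remaining axes are assumptions:
```python
layout = StridedLayout((1, 5, 1, 3), (15, 3, 3, 1), 1)
assert layout.squeezed() == StridedLayout((5, 3), (3, 1), 1)
# an empty layout collapses to the canonical 1-dim empty layout
assert StridedLayout((2, 0, 3), None, 1).squeezed() == StridedLayout((0,), (0,), 1)
```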
- to_dense(
- self: StridedLayout,
- stride_order='K',
- ) → StridedLayout#
Returns a dense layout with the same shape and itemsize, but with dense strides in the specified order.
See the dense_like() method documentation for details.
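A brief sketch, assuming to_dense() mirrors dense_like() with the default 'K' stride order:
```python
layout = StridedLayout.dense((5, 3), 1)[:, :-1]   # sliced, no longer dense
assert not layout.is_dense
assert layout.to_dense() == StridedLayout.dense((5, 2), 1)
```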
- unsqueezed(
- self: StridedLayout,
- axis: int | tuple[int],
- ) → StridedLayout#
Returns a new layout where the specified axis or axes are added as singleton extents. The axis can be either a single integer in the range [0, ndim] or a tuple of unique integers in the range [0, ndim + len(axis) - 1].
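A minimal sketch of the shape change; only the shapes are asserted, since the stride assigned to a new singleton extent is not specified above:
```python
layout = StridedLayout((5, 3), (3, 1), 1)
assert layout.unsqueezed(0).shape == (1, 5, 3)
assert layout.unsqueezed((0, 3)).shape == (1, 5, 3, 1)
```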