warp.tile_load

warp.tile_load(a: Array[Any], shape: tuple[int, ...], offset: tuple[int, ...], storage: str, bounds_check: bool) → Tile[Any, tuple[int, ...]]

Kernel

Differentiable

Load a tile from a global memory array.

All threads in the block cooperate to load the tile from global memory.

Parameters:

a – The source array in global memory.
shape – Shape of the tile to load; must have the same number of dimensions as a.
offset – Offset in the source array at which to begin reading (optional).
storage – The storage location for the tile: "register" for registers (default) or "shared" for shared memory.
bounds_check – Whether to bounds-check the load. Required for unaligned tiles; can be disabled for memory-aligned tiles to speed up loading.

Returns:

A tile with shape as specified and data type the same as the source array.

warp.tile_load(a: Array[Any], shape: int32, offset: int32, storage: str, bounds_check: bool) → Tile[Any, tuple[int, ...]]

Kernel

Differentiable

Load a tile from a global memory array.