warp.tile_store

warp.tile_store(
a: Array[Any],
t: Tile[Any, tuple[int, ...]],
offset: tuple[int, ...],
bounds_check: bool,
aligned: bool,
) -> None
  • Kernel

  • Differentiable

Store a tile to a global memory array.

This function cooperatively stores a tile to global memory, using all threads in the block.

Parameters:
  • a – The destination array in global memory

  • t – The source tile to store data from; it must have the same data type and number of dimensions as the destination array

  • offset – Offset in the destination array (optional)

  • bounds_check – Needed for unaligned tiles; can be disabled for memory-aligned tiles to speed up writes.

  • aligned – If True, skip runtime alignment checks for vectorized stores (shared memory, 2D+ tiles only). Has no effect for 1D tiles or register storage. Use when you guarantee that: (1) the base address at the tile offset is 16-byte aligned, (2) the array is contiguous (dense row-major strides), (3) all outer-dimension strides are multiples of 16 bytes, and (4) the tile fits entirely within array bounds. Address-alignment violations trap unconditionally (even in release builds). Bounds and contiguity violations trigger debug-only asserts; in release builds they cause silent data corruption.
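The four guarantees listed for `aligned` can be checked on the host before launching a kernel. The following is a minimal sketch in plain Python. The helper names (`dense_row_major_strides`, `aligned_preconditions`) and the example pointer value are hypothetical and not part of Warp; the code only mirrors the conditions as worded above and does not reproduce Warp's internal checks.

```python
VECTOR_BYTES = 16  # alignment the text above requires for vectorized stores


def dense_row_major_strides(shape, itemsize):
    """Byte strides of a contiguous row-major array with the given shape."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def aligned_preconditions(base_ptr, shape, strides, itemsize, tile_shape, offset):
    """Evaluate the four documented conditions; True means satisfied."""
    # Address of the first tile element: base pointer plus the byte offset of the tile origin.
    tile_addr = base_ptr + sum(o * s for o, s in zip(offset, strides))
    return {
        # (1) base address at the tile offset is 16-byte aligned
        "address_aligned": tile_addr % VECTOR_BYTES == 0,
        # (2) array is contiguous (dense row-major strides)
        "contiguous": tuple(strides) == dense_row_major_strides(shape, itemsize),
        # (3) every stride except the innermost is a multiple of 16 bytes
        "outer_strides_aligned": all(s % VECTOR_BYTES == 0 for s in strides[:-1]),
        # (4) the tile fits entirely within the array bounds
        "in_bounds": all(
            0 <= o and o + t <= n for o, t, n in zip(offset, tile_shape, shape)
        ),
    }


# Example: a 64x64 float32 array at a hypothetical address, 16x16 tile at (16, 32).
checks = aligned_preconditions(
    base_ptr=0x1000,
    shape=(64, 64),
    strides=(256, 4),
    itemsize=4,
    tile_shape=(16, 16),
    offset=(16, 32),
)
safe_to_pass_aligned = all(checks.values())
```

Returning the individual conditions rather than a single boolean makes it easier to see which guarantee fails, which matters because the text above says the failure modes differ: address misalignment traps, while bounds and contiguity violations can silently corrupt data in release builds.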

warp.tile_store(
a: Array[Any],
t: Tile[Any, tuple[int, ...]],
offset: int32,
bounds_check: bool,
aligned: bool,
) -> None
  • Kernel

  • Differentiable

Store a tile to a global memory array.