Kernel Configuration

API

class warp_nn.utils.config.KernelConfig(
*,
block_dim: int,
tile_1d: tuple[int],
tile_2d: tuple[int, int],
tile_3d: tuple[int, int, int],
tile_4d: tuple[int, int, int, int],
)

Bases: object

Configuration for Warp kernel generation.

block_dim: int

Maximum number of CUDA thread blocks to use.

This only affects CUDA kernel launches. If zero or negative, the hardware maximum is used.

tile_1d: tuple[int]

Shape when operating with 1D tiles.

tile_2d: tuple[int, int]

Shape when operating with 2D tiles.

tile_3d: tuple[int, int, int]

Shape when operating with 3D tiles.

tile_4d: tuple[int, int, int, int]

Shape when operating with 4D tiles.

warp_nn.utils.config.kernel_config(
*,
block_dim: int | None = None,
tile_1d: tuple[int] | None = None,
tile_2d: tuple[int, int] | None = None,
tile_3d: tuple[int, int, int] | None = None,
tile_4d: tuple[int, int, int, int] | None = None,
) -> Generator[None, None, None]

Context manager that sets thread-local configuration values.

warp_nn.utils.config.get_kernel_config() -> KernelConfig

Get the current configuration.