warp.tile_axpy
- warp.tile_axpy(alpha, src, dest) → None
Kernel
Differentiable
Scale src by alpha and accumulate into dest. Performs a fused multiply-add directly into the destination tile without creating an intermediate scaled tile.
- Parameters:
alpha – Scalar multiplier (must match the tile’s underlying scalar type).
src – Source tile (must have the same shape and dtype as dest).
dest – Destination tile, modified in place.
Example
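import warp as wp

@wp.kernel
def compute():
    # dest starts at 2.0 and src at 3.0 in every element
    dest = wp.tile_ones(dtype=float, shape=4) * 2.0
    src = wp.tile_ones(dtype=float, shape=4) * 3.0
    # dest += 5.0 * src, accumulated in place
    wp.tile_axpy(5.0, src, dest)
    print(dest)

wp.launch_tiled(compute, dim=[1], inputs=[], block_dim=64)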
[17 17 17 17] = tile(shape=(4), storage=register)
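To check the arithmetic behind the output above, the element-wise update can be sketched in plain Python. This is only an illustration of the semantics described here (dest += alpha * src), not Warp's implementation. The helper name `axpy_reference` is hypothetical, and plain lists stand in for tiles.

```python
def axpy_reference(alpha, src, dest):
    """Hypothetical CPU reference for the update dest[i] += alpha * src[i].

    Plain Python lists stand in for tiles; this is not part of Warp.
    """
    if len(src) != len(dest):
        raise ValueError("src and dest must have the same shape")
    for i, value in enumerate(src):
        # Accumulate in place, without building a separate scaled copy of src.
        dest[i] += alpha * value


dest = [2.0] * 4  # same starting values as dest in the kernel example
src = [3.0] * 4   # same starting values as src in the kernel example
axpy_reference(5.0, src, dest)
print(dest)
```

Each element works out to 2 + 5 * 3 = 17, which agrees with the tile printed by the kernel.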