warp.tile_scatter_add
- warp.tile_scatter_add( ) None
Kernel
Differentiable
Scatter-add a per-thread value into a shared-memory tile.
Cooperative operation – all threads in the block must call this function. Each thread whose
has_value is True adds value at index i. A synchronization barrier is included so the updated values are visible to all threads after the call returns.
- Parameters:
a – A shared-memory tile to scatter-add into.
i – Index of the element to add to.
value – The value to add (must match the tile’s dtype).
has_value – Whether this thread should perform the add.
atomic – If True (default), use atomic add for safe concurrent writes. Set to False when indices are guaranteed unique across threads (e.g., lane-parallel writes) for better performance.
Example
import warp as wp

@wp.kernel
def histogram(data: wp.array[float], out: wp.array[float]):
    bins = wp.tile_zeros(dtype=float, shape=4, storage="shared")
    i = wp.tid()
    # Bin values in [0, 8) into 4 bins of width 2
    b = int(data[i] / 2.0)
    wp.tile_scatter_add(bins, b, 1.0, True)
    wp.tile_store(out, bins, offset=0)

data = wp.array([0.5, 1.0, 2.5, 3.0, 4.5, 5.0, 6.5, 7.0], dtype=float)
output = wp.zeros(4, dtype=float)
wp.launch_tiled(histogram, dim=[1], inputs=[data, output], block_dim=8)
print(output.numpy())
[2. 2. 2. 2.]
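To make the roles of has_value and atomic concrete, the following is a pure-Python model of one block-wide scatter-add. It is a sketch for illustration only, not Warp code: scatter_add_model and its list-based arguments are hypothetical names, each list entry stands for one thread in the block, and the non-atomic branch shows just one possible interleaving (every thread reads before any thread writes).

```python
def scatter_add_model(tile, indices, values, has_values, atomic=True):
    """Pure-Python model of one block-wide scatter-add (hypothetical, not Warp code)."""
    out = list(tile)
    # Only threads whose has_value flag is True contribute.
    active = [(i, v) for i, v, h in zip(indices, values, has_values) if h]
    if atomic:
        # Atomic adds: every contribution lands, even when indices collide.
        for i, v in active:
            out[i] += v
    else:
        # One possible non-atomic interleaving: all threads read the old
        # value first, then write, so colliding threads overwrite each other.
        snapshot = list(out)
        for i, v in active:
            out[i] = snapshot[i] + v
    return out


# Same data and binning rule as the histogram example above.
data = [0.5, 1.0, 2.5, 3.0, 4.5, 5.0, 6.5, 7.0]
bins = [int(x / 2.0) for x in data]  # [0, 0, 1, 1, 2, 2, 3, 3]
ones = [1.0] * len(data)
all_on = [True] * len(data)

hist_atomic = scatter_add_model([0.0] * 4, bins, ones, all_on)
hist_racy = scatter_add_model([0.0] * 4, bins, ones, all_on, atomic=False)
hist_masked = scatter_add_model([0.0] * 4, bins, ones, [x < 4.0 for x in data])

print(hist_atomic)  # [2.0, 2.0, 2.0, 2.0]
print(hist_racy)    # [1.0, 1.0, 1.0, 1.0]
print(hist_masked)  # [2.0, 2.0, 0.0, 0.0]
```

In this model the histogram needs atomic adds because two threads target each bin. With unique indices the two branches give the same result, which is the case where atomic=False is described as safe.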