warp.launch
warp.launch(
    kernel,
    dim,
    inputs=[],
    outputs=[],
    adj_inputs=[],
    adj_outputs=[],
    device=None,
    stream=None,
    adjoint=False,
    record_tape=True,
    record_cmd=False,
    max_blocks=0,
    block_dim=256,
)
Launch a Warp kernel on the target device.
Kernel launches are asynchronous with respect to the calling Python thread.
- Parameters:
kernel – The Warp kernel function to launch, decorated with the @wp.kernel decorator
dim (int | Sequence[int]) – The number of threads to launch the kernel with; can be an integer or a sequence of integers with a maximum of 4 dimensions.
inputs (Sequence) – The input parameters to the kernel (optional)
outputs (Sequence) – The output parameters (optional)
adj_inputs (Sequence) – The adjoint inputs (optional)
adj_outputs (Sequence) – The adjoint outputs (optional)
stream (Stream | None) – The stream to launch on.
adjoint (bool) – Whether to run the backward (adjoint) pass instead of the forward pass (typically use False).
record_tape (bool) – When True, the launch will be recorded on the global wp.Tape() object when present.
record_cmd (bool) – When True, the launch will return a Launch object. The launch will not occur until the user calls Launch.launch().
max_blocks (int) – The maximum number of CUDA thread blocks to use. Only has an effect for CUDA kernel launches. If negative or zero, the maximum hardware value will be used.
block_dim (int) – The number of threads per block (always 1 for “cpu” devices).