warp.launch

warp.launch(
kernel,
dim,
inputs=[],
outputs=[],
adj_inputs=[],
adj_outputs=[],
device=None,
stream=None,
adjoint=False,
record_tape=True,
record_cmd=False,
max_blocks=0,
block_dim=256,
)

Launch a Warp kernel on the target device.

Kernel launches are asynchronous with respect to the calling Python thread.

Parameters:
  • kernel – The name of a Warp kernel function, decorated with the @wp.kernel decorator

  • dim (int | Sequence[int]) – The number of threads to launch the kernel with. Can be an integer or a sequence of integers with a maximum of 4 dimensions.

  • inputs (Sequence) – The input parameters to the kernel (optional)

  • outputs (Sequence) – The output parameters (optional)

  • adj_inputs (Sequence) – The adjoint inputs (optional)

  • adj_outputs (Sequence) – The adjoint outputs (optional)

  • device (Device | str | None) – The device to launch on.

  • stream (Stream | None) – The stream to launch on.

  • adjoint (bool) – Whether to run the backward (adjoint) pass instead of the forward pass (typically False).

  • record_tape (bool) – When True, the launch will be recorded on the global wp.Tape() object when present.

  • record_cmd (bool) – When True, the launch will return a Launch object. The launch will not occur until the user calls Launch.launch().

  • max_blocks (int) – The maximum number of CUDA thread blocks to use. Only has an effect for CUDA kernel launches. If negative or zero, the maximum hardware value will be used.

  • block_dim (int) – The number of threads per block (always 1 for “cpu” devices).