Examples#

This page links to the cuda.core examples shipped in the cuda-python repository. Use it as a quick index when you want a runnable starting point for a specific workflow.

Compilation and kernel launch#

  • vector_add.py compiles and launches a simple vector-add kernel with CuPy arrays.

  • saxpy.py JIT-compiles a templated SAXPY kernel and launches both float and double instantiations.

  • pytorch_example.py launches a CUDA kernel with PyTorch tensors and a wrapped PyTorch stream.

Multi-device and advanced launch configuration#

Linking and graphs#

  • jit_lto_fractal.py uses JIT link-time optimization to link user-provided device code into a fractal workflow at runtime.

  • cuda_graphs.py captures and replays a multi-kernel CUDA graph to reduce launch overhead.

Interoperability and memory access#

System inspection#