Quick Start


CUDA Quantum streamlines hybrid application development and promotes productivity and scalability in quantum computing. It offers a unified programming model designed for a hybrid setting, that is, CPUs, GPUs, and QPUs working together. CUDA Quantum supports programming in Python and in C++.

This Quick Start page guides you through installing CUDA Quantum and running your first program. If you have already installed and configured CUDA Quantum, or if you are using our Docker image, you can move directly to our Basics Section. More information about working with containers and Docker alternatives can be found in our complete Installation Guide.

Install CUDA Quantum

To develop CUDA Quantum applications using Python, please follow the instructions for installing CUDA Quantum from PyPI. If you have an NVIDIA GPU, make sure to also follow the instructions for enabling GPU acceleration.

CUDA Quantum does not require a GPU, but some components are GPU-accelerated. If you have access to an NVIDIA GPU, you can enable GPU acceleration within CUDA Quantum by installing CUDA as well as a CUDA-aware MPI implementation. We recommend using Conda to do so. If you are not already using Conda, you can install a minimal version following the instructions here. The following commands create and activate a complete environment for CUDA Quantum with all its dependencies:

conda create -y -n cuda-quantum python=3.10 pip
conda install -y -n cuda-quantum -c "nvidia/label/cuda-11.8.0" cuda
conda install -y -n cuda-quantum -c conda-forge mpi4py openmpi cxx-compiler
conda env config vars set -n cuda-quantum LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CONDA_PREFIX/envs/cuda-quantum/lib"
conda env config vars set -n cuda-quantum MPI_PATH=$CONDA_PREFIX/envs/cuda-quantum
conda run -n cuda-quantum pip install cuda-quantum
conda activate cuda-quantum
source $CONDA_PREFIX/lib/python3.10/site-packages/distributed_interfaces/activate_custom_mpi.sh

You must configure MPI by setting the following environment variables:

export OMPI_MCA_opal_cuda_support=true OMPI_MCA_btl='^openib'

If you do not set these variables you may encounter a segmentation fault.

Important: It is not sufficient to set these variables within the Conda environment, as the commands above do for LD_LIBRARY_PATH. To avoid having to set them every time you launch a new shell, we recommend adding them to ~/.profile (create the file if it does not exist), and to ~/.bash_profile or ~/.bash_login if such a file exists.
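As a quick sanity check, the following minimal sketch reports whether the two MPI variables from the export line above are visible to a Python process started from the current shell. The helper name `find_mpi_env_problems` is hypothetical and not part of CUDA Quantum; the expected values are copied from the export command.

```python
import os

# Expected values, copied from the export line above.
REQUIRED_MPI_VARS = {
    "OMPI_MCA_opal_cuda_support": "true",
    "OMPI_MCA_btl": "^openib",
}


def find_mpi_env_problems(env):
    """Return a list of human-readable problems; empty if all variables match.

    Hypothetical helper for illustration only.
    """
    problems = []
    for name, expected in REQUIRED_MPI_VARS.items():
        actual = env.get(name)
        if actual is None:
            problems.append(f"{name} is not set (expected {expected!r})")
        elif actual != expected:
            problems.append(f"{name} is {actual!r} (expected {expected!r})")
    return problems


if __name__ == "__main__":
    issues = find_mpi_env_problems(os.environ)
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("MPI environment variables look as expected.")
```

Running this from a freshly opened shell is a simple way to confirm that the variables were picked up from your profile file rather than only from the current session.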

Once you have completed the installation, please follow the instructions below to run your first CUDA Quantum program!

To develop CUDA Quantum applications using C++, please make sure you have a C++ toolchain installed that supports C++20, for example g++ version 11 or newer. Download the install_cuda_quantum file for your processor architecture from the assets of the respective GitHub release; that is the file with the aarch64 extension for ARM processors, and the one with x86_64 for, e.g., Intel and AMD processors.

To install CUDA Quantum, execute the following commands:

sudo -E bash install_cuda_quantum.$(uname -m) --accept
. /etc/profile
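The following sketch is one possible way to check the C++ prerequisites from a script. It assumes that `g++` and `nvq++` are expected to be on your PATH after the steps above, and it uses the "version 11 or newer" requirement stated earlier. The helper names are hypothetical and not part of CUDA Quantum.

```python
import re
import shutil
import subprocess

MIN_GXX_MAJOR = 11  # from the text above: "g++ version 11 or newer"


def parse_major_version(version_text):
    """Extract the leading major version number, or None if none is found."""
    match = re.search(r"(\d+)(?:\.\d+)*", version_text)
    return int(match.group(1)) if match else None


def check_toolchain():
    """Return a short status string for g++ and nvq++ (hypothetical helper)."""
    report = {}
    gxx = shutil.which("g++")
    if gxx is None:
        report["g++"] = "not found on PATH"
    else:
        # `g++ -dumpversion` prints the compiler version, e.g. "11" or "11.4.0".
        out = subprocess.run(
            [gxx, "-dumpversion"], capture_output=True, text=True
        ).stdout
        major = parse_major_version(out)
        ok = major is not None and major >= MIN_GXX_MAJOR
        verdict = "ok" if ok else "too old or unknown"
        report["g++"] = f"major version {major} ({verdict})"
    # Assumption: the installer makes nvq++ available on PATH.
    report["nvq++"] = shutil.which("nvq++") or "not found on PATH"
    return report


if __name__ == "__main__":
    for tool, status in check_toolchain().items():
        print(f"{tool}: {status}")
```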

If you have an NVIDIA GPU, please also install the CUDA Toolkit to enable GPU acceleration within CUDA Quantum.

Please see the complete installation guide for more details.

Once you have completed the installation, please follow the instructions below to run your first CUDA Quantum program!

Validate your Installation

Let’s run a simple program to validate your installation. The quantum kernel in the following program creates and measures the state \((|00\rangle + |11\rangle) / \sqrt{2}\). That means each kernel execution should yield either 00 or 11. The program samples the kernel, meaning it executes it, 1000 times and prints how many times each output was measured. On average, the values 00 and 11 should each be observed around 500 times.
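To make the expected statistics concrete, here is a small plain-Python sketch that does not use CUDA Quantum at all. It applies a Hadamard gate to the first qubit and controlled-X gates from the first qubit to every other qubit, then samples from the resulting probabilities. All function names are hypothetical, and the bit ordering (qubit 0 as the leftmost bit) is an assumption made for this illustration only.

```python
import math
import random
from collections import Counter


def apply_h(state, target, n):
    """Apply a Hadamard gate to qubit `target` of an n-qubit state vector."""
    s = 1 / math.sqrt(2)
    mask = 1 << (n - 1 - target)  # assumption: qubit 0 is the leftmost bit
    new = list(state)
    for i in range(len(state)):
        if not i & mask:
            a, b = state[i], state[i | mask]
            new[i] = s * (a + b)
            new[i | mask] = s * (a - b)
    return new


def apply_cx(state, control, target, n):
    """Apply a controlled-X gate: flip `target` wherever `control` is 1."""
    cmask = 1 << (n - 1 - control)
    tmask = 1 << (n - 1 - target)
    new = list(state)
    for i in range(len(state)):
        if i & cmask and not i & tmask:
            new[i], new[i | tmask] = state[i | tmask], state[i]
    return new


def ghz_probabilities(n):
    """Return {bitstring: probability} after H on qubit 0 and CX(0, i)."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0  # start in |00...0>
    state = apply_h(state, 0, n)
    for i in range(1, n):
        state = apply_cx(state, 0, i, n)
    return {
        format(i, f"0{n}b"): abs(a) ** 2
        for i, a in enumerate(state)
        if abs(a) > 1e-12
    }


def sample(probs, shots, seed=None):
    """Draw `shots` measurement outcomes according to `probs`."""
    rng = random.Random(seed)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    return Counter(rng.choices(outcomes, weights=weights, k=shots))


if __name__ == "__main__":
    probs = ghz_probabilities(2)
    print(probs)  # two outcomes, 00 and 11, each with probability close to 0.5
    print(dict(sample(probs, 1000)))
```

The counts vary from run to run, but only the all-zeros and all-ones bitstrings should ever appear, each in roughly half of the shots.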

Create a file titled program.py, containing the following code:

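import sys
import cudaq

print(f"Running on target {cudaq.get_target().name}")
# Optional first argument: number of qubits (defaults to 2).
qubit_count = int(sys.argv[1]) if 1 < len(sys.argv) else 2

@cudaq.kernel
def kernel():
    qubits = cudaq.qvector(qubit_count)
    # Put the first qubit into superposition, then entangle the others with it.
    h(qubits[0])
    for i in range(1, qubit_count):
        x.ctrl(qubits[0], qubits[i])
    # Measure all qubits.
    mz(qubits)

result = cudaq.sample(kernel)
print(result)  # Example: { 11:500 00:500 }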

Run this program as you do any other Python program, for example:

python3 program.py

Create a file titled program.cpp, containing the following code:

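#include <cstdlib>
#include <cudaq.h>

__qpu__ void kernel(int qubit_count) {
  cudaq::qvector qubits(qubit_count);
  // Put the first qubit into superposition, then entangle the others with it.
  h(qubits[0]);
  for (auto i = 1; i < qubit_count; ++i) {
    cx(qubits[0], qubits[i]);
  }
  // Measure all qubits.
  mz(qubits);
}

int main(int argc, char *argv[]) {
  // Optional first argument: number of qubits (defaults to 2).
  auto qubit_count = 1 < argc ? atoi(argv[1]) : 2;
  auto result = cudaq::sample(kernel, qubit_count);
  result.dump(); // Example: { 11:500 00:500 }
}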

Compile the program using the nvq++ compiler and run the built application with the following command:

nvq++ program.cpp -o program.x && ./program.x

If you have an NVIDIA GPU, the program uses GPU acceleration by default. To confirm that this works as expected and to see the effects of GPU acceleration, you can increase the number of qubits the program uses to 28 and compare the time it takes to execute on the nvidia target (GPU-accelerated statevector simulator) with the time it takes when the target is set to qpp-cpu (OpenMP-parallelized, CPU-only statevector simulator):

python3 program.py 28 --target nvidia
nvq++ program.cpp -o program.x --target nvidia && ./program.x 28

When you change the target to qpp-cpu, the program appears to hang; that is because the CPU-only backend takes a long time to simulate 28 or more qubits. Cancel the execution with Ctrl+C.
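A rough way to see why 28 qubits is demanding: a state vector for n qubits holds 2^n complex amplitudes, so its size doubles with every added qubit. The sketch below estimates the memory for the state vector alone. The default of 16 bytes per amplitude assumes double-precision complex numbers; the precision each backend actually uses is not stated here, so treat the figures as an order-of-magnitude illustration. The function name is hypothetical.

```python
def statevector_bytes(qubit_count, bytes_per_amplitude=16):
    """Estimate the memory needed to store a full state vector.

    Assumption: 16 bytes per amplitude (double-precision complex);
    pass 8 for single-precision complex.
    """
    return (2 ** qubit_count) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (2, 20, 28, 30):
        gib = statevector_bytes(n) / 2**30
        print(f"{n:2d} qubits: {2**n:>13,} amplitudes, {gib:.6f} GiB")
```

Under that assumption, 28 qubits correspond to 2^28 amplitudes and 4 GiB of state, and every gate has to touch a large share of those amplitudes.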

For more information about enabling GPU acceleration, please see our complete Installation Guide. For further information on available targets, see Backends.

You are now all set to start developing quantum applications using CUDA Quantum! Please proceed to Basics for an introduction to the fundamental features of CUDA Quantum.