Introduction

Welcome to CUDA Quantum! On this page we will illustrate CUDA Quantum with several examples.

We’re going to take a look at how to construct quantum programs through CUDA Quantum’s Kernel API.

When you define a kernel and add operations to it, you construct a quantum program that can then be executed by calling, for example, cudaq.sample. Let’s take a closer look!

import cudaq


# We begin by defining the `Kernel` that we will construct our
# program with.
@cudaq.kernel
def kernel():
    '''
    This is our first CUDA Quantum kernel.
    '''
    # Next, we can allocate a single qubit to the kernel via `cudaq.qubit()`.
    # A register of several qubits can be allocated with `cudaq.qvector(qubit_count)`.
    qubit = cudaq.qubit()

    # Now we can begin adding instructions to apply to this qubit!
    # Here we'll just add every non-parameterized
    # single-qubit gate that is supported by CUDA Quantum.
    h(qubit)
    x(qubit)
    y(qubit)
    z(qubit)
    t(qubit)
    s(qubit)

    # Next, we add a measurement to the kernel so that we can sample
    # the measurement results on our simulator!
    mz(qubit)


# Finally, we can execute this kernel on the state vector simulator
# by calling `cudaq.sample`. This will execute the provided kernel
# `shots_count` times (1000 by default) and return the sampled
# distribution as a dictionary-like `cudaq.SampleResult`.
result = cudaq.sample(kernel)

# Now let's take a look at the `SampleResult` we've gotten back!
print(result)
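
As a sanity check on what the printed result should look like, here is a minimal plain-Python sketch that applies the same six gates to a single qubit starting in |0> and computes the ideal measurement probabilities. It does not use CUDA Quantum; the `GATES` table and the `apply` and `probabilities` helpers are illustrative names assumed for this sketch, not part of the library.

```python
import cmath
import math

SQRT1_2 = 1 / math.sqrt(2)

# Standard 2x2 matrices for the gates used in the kernel above (row-major).
GATES = {
    "h": ((SQRT1_2, SQRT1_2), (SQRT1_2, -SQRT1_2)),
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
    "t": ((1, 0), (0, cmath.exp(1j * math.pi / 4))),
    "s": ((1, 0), (0, 1j)),
}


def apply(gate, state):
    """Multiply a single-qubit state (amp0, amp1) by the named gate matrix."""
    (a, b), (c, d) = GATES[gate]
    return (a * state[0] + b * state[1], c * state[0] + d * state[1])


def probabilities(gate_sequence):
    """Return (P(0), P(1)) after applying the gates in order to |0>."""
    state = (1 + 0j, 0j)
    for gate in gate_sequence:
        state = apply(gate, state)
    return tuple(abs(amp) ** 2 for amp in state)


p0, p1 = probabilities(["h", "x", "y", "z", "t", "s"])
print(f"P(0) = {p0:.3f}, P(1) = {p1:.3f}")
```

After the Hadamard, every remaining gate either swaps the two amplitudes or changes only their phases, so both outcomes stay equally likely. The sampled counts for 0 and 1 should therefore each be close to half the number of shots, with some statistical fluctuation.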

We’re going to take a look at how to construct quantum programs in C++ using CUDA Quantum kernels.

A CUDA Quantum kernel is any typed callable in the language that is annotated with the __qpu__ attribute. Let’s take a look at a very simple “Hello World” example, specifically a CUDA Quantum kernel that prepares a GHZ state on a programmer-specified number of qubits.

// Compile and run with:
// ```
// nvq++ static_kernel.cpp -o ghz.x && ./ghz.x
// ```

#include <cstdio>
#include <cudaq.h>

// Define a CUDA Quantum kernel that is fully specified
// at compile time via templates.
template <std::size_t N>
struct ghz {
  auto operator()() __qpu__ {

    // Compile-time sized array like std::array
    cudaq::qarray<N> q;
    h(q[0]);
    for (int i = 0; i < N - 1; i++) {
      x<cudaq::ctrl>(q[i], q[i + 1]);
    }
    mz(q);
  }
};

int main() {

  auto kernel = ghz<10>{};
  auto counts = cudaq::sample(kernel);

  if (!cudaq::mpi::is_initialized() || cudaq::mpi::rank() == 0) {
    counts.dump();

    // Fine grain access to the bits and counts
    for (auto &[bits, count] : counts) {
      printf("Observed: %s, %zu\n", bits.data(), count);
    }
  }

  return 0;
}

Here we see that we can define a custom struct that is templated on a size_t parameter. Our kernel expression is free to use this template parameter to allocate a register of qubits whose size is known at compile time. Within the kernel, we are free to apply various quantum operations, such as a Hadamard on qubit 0, h(q[0]). Controlled operations are modifications of single-qubit operations: x<cudaq::ctrl>(q[i], q[i + 1]) implements a controlled-X gate with q[i] as the control and q[i + 1] as the target. We can measure single qubits or entire registers; here mz(q) measures the whole register.

In this example we are interested in sampling the final state produced by this CUDA Quantum kernel. To do so, we leverage the generic cudaq::sample function, which returns a data type encoding the observed measurement bit strings and the number of times each string was observed (here the default number of shots is used, 1000).

The following commands compile and execute this code.

nvq++ static_kernel.cpp -o ghz.x
./ghz.x
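
To give a feel for the counts this program should report, here is a minimal plain-Python sketch that builds the same circuit (a Hadamard on qubit 0 followed by a chain of controlled-X gates) on a sparse state vector and then draws 1000 shots from it. It does not call CUDA Quantum; `ghz_state`, `sample`, and the fixed `seed` are hypothetical names and choices made only for this illustration.

```python
import math
import random
from collections import Counter


def ghz_state(n):
    """Return {bitstring: amplitude} after H on qubit 0 and a CNOT chain."""
    state = {"0" * n: 1.0}

    # Hadamard on qubit 0 (leftmost character of the bit string).
    after_h = {}
    for bits, amp in state.items():
        for first in "01":
            sign = -1.0 if (bits[0] == "1" and first == "1") else 1.0
            key = first + bits[1:]
            after_h[key] = after_h.get(key, 0.0) + sign * amp / math.sqrt(2)
    state = after_h

    # Controlled-X from qubit i to qubit i + 1, for i = 0 .. n - 2.
    for i in range(n - 1):
        after_cx = {}
        for bits, amp in state.items():
            if bits[i] == "1":
                target = "1" if bits[i + 1] == "0" else "0"
                bits = bits[: i + 1] + target + bits[i + 2 :]
            after_cx[bits] = amp
        state = after_cx
    return state


def sample(state, shots=1000, seed=0):
    """Draw `shots` measurement outcomes according to |amplitude|^2."""
    rng = random.Random(seed)
    outcomes = list(state)
    weights = [abs(state[bits]) ** 2 for bits in outcomes]
    return Counter(rng.choices(outcomes, weights=weights, k=shots))


counts = sample(ghz_state(10))
for bits, count in sorted(counts.items()):
    print(f"Observed: {bits}, {count}")
```

Only the all-zeros and all-ones bit strings carry amplitude in a GHZ state, so the output has at most two lines whose counts sum to the number of shots, each near 500. A run of ghz.x should show the same pattern, although the exact split varies from run to run.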