Simulations with cuQuantum
CUDA Quantum supports cuQuantum-accelerated state vector and tensor network simulations. Let’s take a look at an example that is impractically slow on a standard CPU-only simulator but is handled easily by an NVIDIA GPU-accelerated backend:
# This example is meant to demonstrate the cuQuantum
# GPU-accelerated backends and their ability to easily handle
# a larger number of qubits compared to the CPU-only backend.
#
# This will take a noticeably longer time to execute on
# CPU-only backends.
import cudaq
qubit_count = 28
@cudaq.kernel
def kernel(qubit_count: int):
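    qvector = cudaq.qvector(qubit_count)
    # Put only the first qubit into superposition; the CNOT chain below
    # then entangles the remaining qubits to form the GHZ state.
    h(qvector[0])
    for qubit in range(qubit_count - 1):
        x.ctrl(qvector[qubit], qvector[qubit + 1])
    mz(qvector)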
result = cudaq.sample(kernel, qubit_count, shots_count=100)
if (not cudaq.mpi.is_initialized()) or (cudaq.mpi.rank() == 0):
    print(result)
Here we generate a GHZ state on 28 qubits. The built-in cuQuantum state vector
backend is selected by default if a local GPU is present. Alternatively, the
target may be manually set through the cudaq.set_target("nvidia") command.
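To see why the qubit count matters, here is a rough sketch of how state vector memory grows. A state vector on n qubits holds 2^n complex amplitudes. The helper name `state_vector_bytes` is hypothetical, and the default of 8 bytes per amplitude (single-precision complex) is an assumption; use 16 for double precision.

```python
# Rough memory estimate for a full state vector simulation.
# `state_vector_bytes` is an illustrative helper, not part of cudaq.

def state_vector_bytes(qubit_count: int, bytes_per_amplitude: int = 8) -> int:
    """Return the bytes needed to store all 2**n complex amplitudes.

    bytes_per_amplitude is assumed to be 8 (single-precision complex);
    pass 16 for double-precision complex.
    """
    if qubit_count < 0:
        raise ValueError("qubit_count must be non-negative")
    return (2 ** qubit_count) * bytes_per_amplitude


for n in (20, 28, 32):
    gib = state_vector_bytes(n) / 2**30
    print(f"{n} qubits: {gib:g} GiB at 8 bytes per amplitude")
```

Under the 8-byte assumption, the 28-qubit example needs 2 GiB for the state alone, and every additional qubit doubles that figure.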
// Compile and run with:
// ```
// nvq++ cuquantum_backends.cpp -o ghz.x --target nvidia && ./ghz.x
// ```
// This example is meant to demonstrate the cuQuantum
// GPU-accelerated backends and their ability to easily handle
// a larger number of qubits compared to the CPU-only backend.
// On CPU-only backends, this may appear to hang because simulating
// this number of qubits takes a long time.
#include <cstdio>
#include <cudaq.h>
// Define a quantum kernel with a runtime parameter
struct ghz {
  auto operator()(const int N) __qpu__ {
    // Dynamically sized vector of qubits
    cudaq::qvector q(N);
    h(q[0]);
    for (int i = 0; i < N - 1; i++) {
      x<cudaq::ctrl>(q[i], q[i + 1]);
    }
    mz(q);
  }
};
int main() {
  auto counts = cudaq::sample(/*shots=*/100, ghz{}, 28);
  if (!cudaq::mpi::is_initialized() || cudaq::mpi::rank() == 0) {
    counts.dump();
    // Fine grain access to the bits and counts
    for (auto &[bits, count] : counts) {
      printf("Observed: %s, %zu\n", bits.data(), count);
    }
  }
  return 0;
}
Here we generate a GHZ state on 28 qubits. To run with the built-in cuQuantum state
vector support, we pass the --target nvidia flag at compile time:
nvq++ --target nvidia cuquantum_backends.cpp -o ghz.x
./ghz.x
Alternatively, we can set the environment variable CUDAQ_DEFAULT_SIMULATOR to nvidia.
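Whichever backend runs the kernel, an ideal noiseless GHZ state can only be measured as all zeros or all ones. The following is a minimal sketch of a sanity check on the sampled counts, assuming they have first been copied into a plain Python dict that maps bitstrings to counts. The helper `looks_like_ghz` is hypothetical, and the counts shown are made-up placeholder values, not measured results.

```python
# Sanity check for GHZ sampling results.
# `looks_like_ghz` is an illustrative helper, not part of cudaq.

def looks_like_ghz(counts: dict, qubit_count: int) -> bool:
    """Return True if every observed bitstring is all zeros or all ones."""
    zeros = "0" * qubit_count
    ones = "1" * qubit_count
    # An empty result is not evidence of a GHZ state.
    return bool(counts) and set(counts) <= {zeros, ones}


# Placeholder counts for illustration only (not real output).
example_counts = {"0" * 28: 53, "1" * 28: 47}
print(looks_like_ghz(example_counts, 28))
```

Any other bitstring in the counts points to a mistake in the circuit, such as applying the Hadamard gate to every qubit instead of only the first.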