Simulations with cuQuantum
CUDA-Q supports cuQuantum-accelerated state vector and tensor network simulations. Let's look at an example that is too large to run quickly on a standard CPU-only simulator but is easily handled by an NVIDIA GPU-accelerated backend:
# This example is meant to demonstrate the cuQuantum
# GPU-accelerated backends and their ability to easily handle
# a larger number of qubits compared to the CPU-only backend.
#
# This will take a noticeably longer time to execute on
# CPU-only backends.
import cudaq
qubit_count = 5
# We can set a larger `qubit_count` if running on a GPU backend.
# qubit_count = 28
@cudaq.kernel
def kernel(qubit_count: int):
    qvector = cudaq.qvector(qubit_count)
    # A Hadamard on the first qubit only, followed by a CNOT chain,
    # prepares the GHZ state.
    h(qvector[0])
    for qubit in range(qubit_count - 1):
        x.ctrl(qvector[qubit], qvector[qubit + 1])
    mz(qvector)
result = cudaq.sample(kernel, qubit_count, shots_count=100)
if (not cudaq.mpi.is_initialized()) or (cudaq.mpi.rank() == 0):
    print(result)
Here we generate a GHZ state. The script defaults to 5 qubits; set
qubit_count = 28 when running on a GPU backend. The built-in cuQuantum
state vector backend is selected by default if a local GPU is present.
Alternatively, the target may be set manually with the
cudaq.set_target("nvidia") command.
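To make the expected measurement result concrete, here is a minimal pure-Python sketch of the GHZ circuit (a Hadamard on the first qubit followed by a chain of CNOT gates) on a small register. It does not use CUDA-Q. The helper name `ghz_statevector` and the choice of qubit 0 as the least-significant bit are assumptions made for illustration only.

```python
import math


def ghz_statevector(n):
    """Return the 2**n amplitudes after H on qubit 0 and a CNOT chain.

    Hypothetical helper for illustration; qubit 0 is the least-significant bit.
    """
    dim = 1 << n
    state = [0.0] * dim
    state[0] = 1.0  # start in |00...0>

    # Hadamard on qubit 0: |0> -> (|0> + |1>)/sqrt(2), |1> -> (|0> - |1>)/sqrt(2)
    s = 1.0 / math.sqrt(2.0)
    new = [0.0] * dim
    for i, amp in enumerate(state):
        if amp == 0.0:
            continue
        base = i & ~1
        sign = -1.0 if (i & 1) else 1.0
        new[base] += s * amp
        new[base | 1] += s * sign * amp
    state = new

    # CNOT chain: qubit q controls qubit q + 1
    for q in range(n - 1):
        new = [0.0] * dim
        for i, amp in enumerate(state):
            j = i ^ (1 << (q + 1)) if (i >> q) & 1 else i
            new[j] = amp
        state = new
    return state


n = 5
probs = {
    format(i, f"0{n}b"): round(amp * amp, 3)
    for i, amp in enumerate(ghz_statevector(n))
    if amp != 0.0
}
print(probs)  # {'00000': 0.5, '11111': 0.5}
```

Only the all-zeros and all-ones bitstrings carry probability, so a sampled result from the kernel should contain just those two outcomes in roughly equal proportion.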
// Compile and run with:
// ```
// nvq++ cuquantum_backends.cpp -o dyn.x --target nvidia && ./dyn.x
// ```
// This example is meant to demonstrate the cuQuantum
// GPU-accelerated backends and their ability to easily handle
// a larger number of qubits compared to the CPU-only backend.
// On CPU-only backends, this may appear to hang because simulating
// this many qubits takes a long time.
#include <cstdio>
#include <cudaq.h>

// Define a quantum kernel with a runtime parameter
struct ghz {
  auto operator()(const int N) __qpu__ {
    // Dynamically sized vector of qubits
    cudaq::qvector q(N);
    h(q[0]);
    for (int i = 0; i < N - 1; i++) {
      x<cudaq::ctrl>(q[i], q[i + 1]);
    }
    mz(q);
  }
};

int main() {
  auto shots_count = 1024 * 1024;
  auto counts = cudaq::sample(shots_count, ghz{}, 28);

  // Print from a single process only when running under MPI
  if (!cudaq::mpi::is_initialized() || cudaq::mpi::rank() == 0) {
    counts.dump();

    // Fine grain access to the bits and counts
    for (auto &[bits, count] : counts) {
      printf("Observed: %s, %lu\n", bits.data(), count);
    }
  }

  return 0;
}
Here we generate a GHZ state on 28 qubits. To run with the built-in cuQuantum state
vector support, we pass the --target nvidia flag at compile time:
nvq++ --target nvidia cuquantum_backends.cpp -o ghz.x
./ghz.x
Alternatively, we can set the environment variable CUDAQ_DEFAULT_SIMULATOR to nvidia.
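Why 28 qubits is demanding can be estimated with simple arithmetic: a state vector simulator stores 2^n complex amplitudes for n qubits. The sketch below is a rough estimate, not a measurement. It assumes 8 bytes per amplitude for single-precision complex values and 16 bytes for double precision. This section does not state which precision a given backend uses, so both are shown. The function name `statevector_bytes` is hypothetical.

```python
def statevector_bytes(qubit_count, bytes_per_amplitude=8):
    """Estimate the memory needed to hold a full state vector.

    Assumption: 8 bytes per amplitude (complex64) by default;
    pass 16 for complex128.
    """
    return (1 << qubit_count) * bytes_per_amplitude


for n in (5, 28):
    single = statevector_bytes(n, 8)
    double = statevector_bytes(n, 16)
    print(f"{n} qubits: {single} bytes (single), {double} bytes (double)")
```

For 28 qubits this gives 2 GiB at single precision and 4 GiB at double precision, before any workspace overhead. Each additional qubit doubles the requirement, and every gate must touch the whole vector, which is why the example runs noticeably slower on a CPU-only backend.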