Executing Quantum Circuits

In CUDA-Q, there are three ways to execute quantum kernels:

  1. sample: yields measurement counts

  2. observe: yields expectation values

  3. get_state: yields the quantum statevector of the computation

Sample

Quantum states collapse upon measurement and hence need to be sampled many times to gather statistics. The CUDA-Q sample call enables this:

[5]:
import cudaq
import numpy as np

qubit_count = 2

# Define the simulation target.
cudaq.set_target("qpp-cpu")

# Define a quantum kernel function.

@cudaq.kernel
def kernel(qubit_count: int):
    qvector = cudaq.qvector(qubit_count)

    # 2-qubit GHZ state.
    h(qvector[0])
    for i in range(1, qubit_count):
        x.ctrl(qvector[0], qvector[i])

    # If we don't specify measurements, all qubits are measured in
    # the Z-basis by default. We could also specify them manually:
    # mz(qvector)


print(cudaq.draw(kernel, qubit_count))

result = cudaq.sample(kernel, qubit_count, shots_count=1000)

print(result)
     ╭───╮
q0 : ┤ h ├──●──
     ╰───╯╭─┴─╮
q1 : ─────┤ x ├
          ╰───╯

{ 11:506 00:494 }

Note that there is a subtle difference between how sample is executed with the target device set to a simulator or with the target device set to a QPU. In simulation mode, the quantum state is built once and then sampled \(s\) times where \(s\) equals the shots_count. In hardware execution mode, the quantum state collapses upon measurement and hence needs to be rebuilt over and over again.
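The simulation-mode behavior can be sketched in plain NumPy (a hypothetical illustration, not the actual CUDA-Q internals): the statevector is built once, and outcomes are then drawn shots_count times from the Born-rule probabilities.

```python
import numpy as np

# Hypothetical sketch of simulation-mode sampling: build the GHZ
# statevector once, then sample it shots_count times.
state = np.array([1, 0, 0, 1]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)
probabilities = np.abs(state) ** 2           # Born-rule probabilities

rng = np.random.default_rng(seed=0)
shots_count = 1000
outcomes = rng.choice(len(state), size=shots_count, p=probabilities)

# Tally outcomes as bitstrings (qubit 0 on the left).
counts = {}
for outcome in outcomes:
    bits = ''.join(str((outcome >> q) & 1) for q in range(2))
    counts[bits] = counts.get(bits, 0) + 1

print(counts)  # roughly 500 each for '00' and '11'
```

On hardware, by contrast, every one of the 1000 shots requires preparing the state from scratch, since the measurement collapses it.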

There are a number of helpful tools that can be found in the API docs to process the SampleResult object produced by sample.
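As a sketch of the kind of post-processing those tools provide, here is the same analysis done by hand in plain Python on a counts dictionary like the one printed above (the real helpers live on the result object; see the API docs):

```python
# Hypothetical by-hand post-processing of sampled counts.
counts = {'11': 506, '00': 494}
shots = sum(counts.values())

# Most probable bitstring.
most_probable = max(counts, key=counts.get)

# Estimated probability of each outcome.
probabilities = {bits: n / shots for bits, n in counts.items()}

print(most_probable)        # '11'
print(probabilities['00'])  # 0.494
```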

Observe

The observe function allows us to calculate expectation values. We must supply a spin operator in the form of a Hamiltonian, \(H\), from which we would like to calculate \(\bra{\psi}H\ket{\psi}\).

[6]:
from cudaq import spin

# Define a Hamiltonian in terms of Pauli Spin operators.
hamiltonian = spin.z(0) + spin.y(1) + spin.x(0) * spin.z(0)

# Compute the expectation value given the state prepared by the kernel.
result = cudaq.observe(kernel, hamiltonian, qubit_count).expectation()

print('<H> =', result)
<H> = 0.0
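We can sanity-check this value independently with NumPy (not CUDA-Q): build the Hamiltonian as a matrix from Pauli operators and evaluate \(\bra{\psi}H\ket{\psi}\) directly on the GHZ state, assuming qubit 0 corresponds to the least significant bit of the statevector index.

```python
import numpy as np

# Pauli matrices.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# GHZ state (|00> + |11>)/sqrt(2). With qubit 0 as the least
# significant bit, an operator A on qubit 1 and B on qubit 0
# becomes kron(A, B).
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

# H = Z(0) + Y(1) + X(0)*Z(0); the last term multiplies two
# operators acting on the same qubit.
H = np.kron(I, Z) + np.kron(Y, I) + np.kron(I, X @ Z)

expectation = np.real(psi.conj() @ H @ psi)
print(expectation)  # 0.0 up to floating-point noise
```

Each term vanishes on the GHZ state, consistent with the observe result above.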

Get state

The get_state function gives us access to the quantum statevector of the computation. Remember that this is only feasible in simulation mode.

[7]:
# Compute the statevector of the kernel
result = cudaq.get_state(kernel, qubit_count)

print(np.array(result))
[0.70710678+0.j 0.        +0.j 0.        +0.j 0.70710678+0.j]

The statevector generated by the get_state command follows a little-endian convention for associating numbers with their binary representations, placing the least significant bit on the left. That is, for the example of a 2-bit system, we have the following translation between integers and bits:

\[\begin{split}\begin{matrix} \text{Integer} & \text{Binary representation}\\ & \text{least significant bit on left}\\ 0 =\textcolor{red}{0}*2^0+\textcolor{blue}{0}*2^1 & \textcolor{red}{0}\textcolor{blue}{0} \\ 1 = \textcolor{red}{1}*2^0 + \textcolor{blue}{0} *2^1 & \textcolor{red}{1}\textcolor{blue}{0}\\ 2 = \textcolor{red}{0}*2^0 + \textcolor{blue}{1}*2^1 & \textcolor{red}{0}\textcolor{blue}{1} \\ 3 = \textcolor{red}{1}*2^0 + \textcolor{blue}{1}*2^1 & \textcolor{red}{1}\textcolor{blue}{1} \end{matrix}\end{split}\]
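The index-to-bitstring mapping in the table above can be expressed as a short helper (a hypothetical utility for illustration, not part of the CUDA-Q API):

```python
def index_to_bits(index: int, qubit_count: int) -> str:
    """Bitstring for a statevector index, least significant bit on the left."""
    return ''.join(str((index >> q) & 1) for q in range(qubit_count))

# Reproduce the table for the 2-qubit case.
for n in range(4):
    print(n, index_to_bits(n, 2))
# 0 00
# 1 10
# 2 01
# 3 11
```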

Parallelization Techniques

The most intensive task in the computation is the execution of the quantum kernel itself; hence, each execution function (sample, observe, and get_state) can be parallelized given access to multiple quantum processing units (multi-QPU).

Since multi-QPU platforms are not yet feasible, we emulate each QPU with a GPU.

Observe Async

Asynchronous programming is a technique that enables your program to start a potentially long-running task and remain responsive to other events while that task runs, rather than waiting until it has finished. Once the task has finished, your program is presented with the result.
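The pattern is the same as generic asynchronous execution in Python; as a sketch with only the standard library (no CUDA-Q involved, and long_running_task is a hypothetical stand-in for a call like observe):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def long_running_task(x: int) -> int:
    """Stand-in for a time-intensive call such as observe."""
    time.sleep(0.1)
    return x * x

with ThreadPoolExecutor() as executor:
    # Start the task; submit returns a future immediately.
    future = executor.submit(long_running_task, 7)

    # The program stays responsive and can do other work here...
    other_work = sum(range(10))

    # ...and collects the result once the task finishes.
    result = future.result()

print(other_work, result)  # 45 49
```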

observe can be a time-intensive task. We can parallelize its execution over the arguments it accepts.

[8]:
# Set the simulation target to a multi-QPU platform
# cudaq.set_target("nvidia", option = 'mqpu')

# Measuring the expectation value of 2 different hamiltonians in parallel
hamiltonian_1 = spin.x(0) + spin.y(1) + spin.z(0)*spin.y(1)
# hamiltonian_2 = spin.z(1) + spin.y(0) + spin.x(1)*spin.x(0)

# Asynchronous execution on multiple qpus via nvidia gpus.
result_1 = cudaq.observe_async(kernel, hamiltonian_1, qubit_count, qpu_id=0)
# result_2 = cudaq.observe_async(kernel, hamiltonian_2, qubit_count, qpu_id=1)

# Retrieve results
print(result_1.get().expectation())
# print(result_2.get().expectation())
2.220446049250313e-16

Above, we parallelized the observe call over the Hamiltonian parameter; however, we can parallelize over any of the arguments it accepts by iterating over the qpu_id.

Sample Async

Similar to observe_async above, sample also supports asynchronous execution for the arguments it accepts. One can parallelize over various kernels, variational parameters, or even distribute shot counts over multiple QPUs.

Get State Async

Similar to sample_async above, get_state also supports asynchronous execution for the arguments it accepts.

[9]:
print(cudaq.__version__)

CUDA-Q Version  (https://github.com/NVIDIA/cuda-quantum 0eb6b444eb5b3a687e6fd64529ee9223aaa2870e)