Optimizers and Gradients

Many quantum algorithms require the optimization of quantum circuit parameters with respect to an expectation value. CUDA-Q provides a number of tools for this task. This example demonstrates how to optimize the variational parameters of a circuit using:

  1. Built-in CUDA-Q optimizers and gradients

  2. A third-party optimizer

  3. A parallel parameter-shift gradient

First, the kernel and Hamiltonian are specified below.

[7]:
import cudaq
from cudaq import spin
import numpy as np

hamiltonian = 5.907 - 2.1433 * spin.x(0) * spin.x(1) - 2.1433 * spin.y(
    0) * spin.y(1) + .21829 * spin.z(0) - 6.125 * spin.z(1)

@cudaq.kernel
def kernel(angles: list[float]):

    qubits = cudaq.qvector(2)
    x(qubits[0])
    ry(angles[0], qubits[1])
    x.ctrl(qubits[1], qubits[0])

initial_params = np.random.normal(0, np.pi, 2)

Built-in CUDA-Q Optimizers and Gradients

The optimizer and gradient strategy are first selected from the built-in CUDA-Q options. An objective function is defined next, which uses a lambda expression to evaluate the cost (a CUDA-Q observe expectation value). The gradient is then calculated with the gradient strategy's compute method.

[8]:
optimizer = cudaq.optimizers.Adam()
gradient = cudaq.gradients.CentralDifference()


def objective_function(parameter_vector: list[float],
                       hamiltonian=hamiltonian,
                       gradient_strategy=gradient,
                       kernel=kernel) -> tuple[float, list[float]]:

    # Cost: expectation value of the Hamiltonian at the given parameters.
    get_result = lambda parameter_vector: cudaq.observe(
        kernel, hamiltonian, parameter_vector).expectation()

    cost = get_result(parameter_vector)

    # Gradient: computed with the selected built-in gradient strategy.
    gradient_vector = gradient_strategy.compute(parameter_vector, get_result,
                                                cost)

    return cost, gradient_vector

Finally, the optimizer is called to return the optimal cost and parameters.

[9]:
energy, parameter = optimizer.optimize(dimensions=1, function=objective_function)

print(f"\nminimized <H> = {round(energy,16)}")
print(f"optimal theta 0 = {round(parameter[0],16)}")

minimized <H> = -1.748382901613712
optimal theta 0 = 0.58409164053813
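
The built-in CentralDifference strategy can be swapped for any other built-in gradient without changing the workflow. The short sketch below is illustrative only (not part of the original example); it assumes cudaq.gradients.ParameterShift is available in your CUDA-Q version and reuses the kernel and Hamiltonian defined above with a fresh Adam optimizer.

ps_gradient = cudaq.gradients.ParameterShift()


def ps_objective_function(
        parameter_vector: list[float]) -> tuple[float, list[float]]:
    # Expectation value of the Hamiltonian serves as the cost.
    get_result = lambda parameter_vector: cudaq.observe(
        kernel, hamiltonian, parameter_vector).expectation()
    cost = get_result(parameter_vector)
    # Gradient computed with the parameter-shift strategy.
    gradient_vector = ps_gradient.compute(parameter_vector, get_result, cost)
    return cost, gradient_vector


energy_ps, parameter_ps = cudaq.optimizers.Adam().optimize(
    dimensions=1, function=ps_objective_function)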

Third-Party Optimizers

The same procedure can be accomplished using any third-party optimizer, such as those provided by SciPy. In this case, a simple cost function is defined and provided as an input to the standard SciPy minimize function.

[10]:
from scipy.optimize import minimize

def cost(theta):

    exp_val = cudaq.observe(kernel, hamiltonian, theta).expectation()

    return exp_val

result = minimize(cost, initial_params, method='COBYLA', options={'maxiter': 40})
print(result)
 message: Optimization terminated successfully.
 success: True
  status: 1
     fun: -1.7488646919931474
       x: [ 5.944e-01  1.288e+00]
    nfev: 33
   maxcv: 0.0
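
Gradient-based SciPy methods work the same way. The following is a minimal illustrative sketch (not part of the original example) that reuses cost, kernel, hamiltonian, and initial_params from above and supplies the built-in CUDA-Q CentralDifference gradient through SciPy's jac argument:

grad_strategy = cudaq.gradients.CentralDifference()


def jacobian(theta):
    # Finite-difference gradient of the expectation value at theta.
    return np.array(grad_strategy.compute(list(theta), cost, cost(theta)))


result_bfgs = minimize(cost, initial_params, method='BFGS', jac=jacobian)

Here SciPy calls jacobian at whichever parameters it is currently evaluating, so the gradient always matches the cost.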

Parallel Parameter Shift Gradients

CUDA-Q’s mqpu backend allows for parallel computation of parameter shift gradients using multiple simulated QPUs. Gradients computed this way can be used in any of the previously discussed optimization procedures. Below is an example demonstrating how parallel gradient evaluation can be used for a VQE procedure.

The parameter shift procedure computes two expectation values for each parameter: one with the parameter shifted forwards and one with it shifted backwards. These are used to estimate the gradient contribution of that parameter.
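
Concretely, writing $e_i$ for the unit vector along the $i$-th parameter and $\epsilon$ for the shift, each gradient component is estimated by the central-difference formula used in the code below:

\frac{\partial \langle H \rangle}{\partial \theta_i} \approx \frac{\langle H \rangle(\theta + \epsilon\, e_i) - \langle H \rangle(\theta - \epsilon\, e_i)}{2\epsilon}

Since the two expectation values for each parameter are independent, all 2·len(parameters) evaluations can be dispatched in parallel across the simulated QPUs.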

The following code defines a function that takes a kernel, a Hamiltonian (spin operator), and the circuit parameters, and produces a parameter-shift gradient with shift epsilon. The function first builds xplus and xminus, arrays containing the forward- and backward-shifted parameters.

Next, a for loop iterates over all of the parameters and uses cudaq.observe_async to compute the expectation values. This command also takes qpu_id as an input, which specifies the GPU that will simulate the ith QPU. The num_qpus variable sets how many GPUs (simulated QPUs) are available; with four GPUs, setting num_qpus = 4 batches the gradient evaluations over four devices, while the default of num_qpus = 1 used below runs everything on a single device.

The results are saved in the g_plus and g_minus lists of asynchronous handles, whose values are retrieved with calls like g_plus[i].get().expectation() to compute the finite differences and construct the final gradient.

[11]:
import numpy as np
# cudaq.set_target('nvidia', option='mqpu')

num_qpus = 1
epsilon = np.pi / 4


def batched_gradient_function(kernel, parameters, hamiltonian, epsilon):

    # Prepare arrays of parameters corresponding to the plus and minus shifts
    x = np.tile(parameters, (len(parameters), 1))
    xplus = x + (np.eye(x.shape[0]) * epsilon)
    xminus = x - (np.eye(x.shape[0]) * epsilon)

    g_plus = []
    g_minus = []

    qpu_counter = 0  # Iterate over the number of GPU resources available

    for i in range(x.shape[0]):

        g_plus.append(
            cudaq.observe_async(kernel,
                                hamiltonian,
                                xplus[i],
                                qpu_id=qpu_counter % num_qpus))
        qpu_counter += 1

        g_minus.append(
            cudaq.observe_async(kernel,
                                hamiltonian,
                                xminus[i],
                                qpu_id=qpu_counter % num_qpus))
        qpu_counter += 1

    # Use the expectation values to compute the central-difference gradient
    gradient = [(g_plus[i].get().expectation() - g_minus[i].get().expectation())
                / (2 * epsilon) for i in range(len(g_minus))]

    return gradient

This function can be used in a VQE procedure as presented below. First, the gradient is computed using the initial parameters; then a standard objective function is constructed in which batched_gradient_function evaluates the gradient at the current parameters of each step. This objective function returns the cost and gradient at every iteration and can be used with any SciPy optimizer that accepts a gradient.

[12]:
gradient = batched_gradient_function(kernel, initial_params, hamiltonian, epsilon)


def objective_function(parameter_vector,
                       hamiltonian=hamiltonian,
                       gradient=gradient,
                       kernel=kernel):

    # Cost: expectation value of the Hamiltonian at the current parameters.
    cost = cudaq.observe(kernel, hamiltonian, parameter_vector).expectation()

    # Gradient: parallel parameter-shift evaluation at the current parameters.
    gradient_vector = batched_gradient_function(kernel, parameter_vector,
                                                hamiltonian, epsilon)

    return cost, gradient_vector

[13]:
result_vqe = minimize(objective_function,
                      initial_params,
                      method='L-BFGS-B',
                      jac=True,
                      tol=1e-8,
                      options={'maxiter': 5})
print(result_vqe)