Optimizers and Gradients¶
Many quantum algorithms require the optimization of quantum circuit parameters with respect to an expectation value. CUDA-Q provides a number of tools for this task. This example demonstrates how to optimize the variational parameters of a circuit using:
Built-in CUDA-Q optimizers and gradients
A third-party optimizer
A parallel parameter shift gradient
First, the kernel and Hamiltonian are specified below.
[7]:
import cudaq
from cudaq import spin
import numpy as np
# Two-qubit Hamiltonian used throughout this example
hamiltonian = 5.907 - 2.1433 * spin.x(0) * spin.x(1) - 2.1433 * spin.y(
    0) * spin.y(1) + .21829 * spin.z(0) - 6.125 * spin.z(1)


# Parameterized kernel whose angles will be optimized
@cudaq.kernel
def kernel(angles: list[float]):
    qubits = cudaq.qvector(2)
    x(qubits[0])
    ry(angles[0], qubits[1])
    x.ctrl(qubits[1], qubits[0])


# Random starting parameters for the optimizations below
initial_params = np.random.normal(0, np.pi, 2)
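As a quick illustrative check (using only the cudaq.observe call employed throughout this example), the expectation value of the Hamiltonian at these random starting parameters can be evaluated directly; this is the quantity the optimizers below will minimize:

# Illustrative check: evaluate <H> at the random starting parameters
initial_energy = cudaq.observe(kernel, hamiltonian, initial_params).expectation()
print(f"<H> at initial parameters = {initial_energy}")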
Built-in CUDA-Q Optimizers and Gradients¶
The optimizer and gradient are specified first using a built-in CUDA-Q optimizer and gradient technique. An objective function is defined next which uses a lambda expression to evaluate the cost (a CUDA-Q observe expectation value). The gradient is calculated using the gradient strategy's compute method.
[8]:
optimizer = cudaq.optimizers.Adam()
gradient = cudaq.gradients.CentralDifference()


def objective_function(parameter_vector: list[float],
                       hamiltonian=hamiltonian,
                       gradient_strategy=gradient,
                       kernel=kernel) -> tuple[float, list[float]]:
    # Cost is the expectation value of the Hamiltonian at the given parameters
    get_result = lambda parameter_vector: cudaq.observe(
        kernel, hamiltonian, parameter_vector).expectation()
    cost = get_result(parameter_vector)
    # The gradient strategy evaluates the gradient from the cost function
    gradient_vector = gradient_strategy.compute(parameter_vector, get_result,
                                                cost)
    return cost, gradient_vector
Finally, the optimizer is called to return the optimal cost and parameters.
[9]:
energy, parameter = optimizer.optimize(dimensions=1, function=objective_function)
print(f"\nminimized <H> = {round(energy,16)}")
print(f"optimal theta 0 = {round(parameter[0],16)}")
minimized <H> = -1.748382901613712
optimal theta 0 = 0.58409164053813
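The objective function above is not tied to the Adam optimizer or the central difference gradient; other built-in choices can be swapped in with the same pattern. A sketch, assuming the cudaq.optimizers.LBFGS optimizer and cudaq.gradients.ParameterShift gradient strategy are available in the installed CUDA-Q version:

# Sketch: same objective structure with a different built-in optimizer and gradient.
# LBFGS and ParameterShift are assumed to exist in the installed CUDA-Q version.
optimizer_lbfgs = cudaq.optimizers.LBFGS()
gradient_ps = cudaq.gradients.ParameterShift()


def objective_function_ps(parameter_vector: list[float]) -> tuple[float, list[float]]:
    get_result = lambda parameter_vector: cudaq.observe(
        kernel, hamiltonian, parameter_vector).expectation()
    cost = get_result(parameter_vector)
    gradient_vector = gradient_ps.compute(parameter_vector, get_result, cost)
    return cost, gradient_vector


energy_lbfgs, parameter_lbfgs = optimizer_lbfgs.optimize(
    dimensions=1, function=objective_function_ps)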
Third-Party Optimizers¶
The same procedure can be accomplished using any third-party optimizer, such as SciPy. In this case, a simple cost function is defined and provided as an input to the standard SciPy minimize function.
[10]:
from scipy.optimize import minimize


def cost(theta):
    exp_val = cudaq.observe(kernel, hamiltonian, theta).expectation()
    return exp_val


result = minimize(cost, initial_params, method='COBYLA', options={'maxiter': 40})
print(result)
message: Optimization terminated successfully.
success: True
status: 1
fun: -1.7488646919931474
x: [ 5.944e-01 1.288e+00]
nfev: 33
maxcv: 0.0
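The optimized parameters are stored in result.x; as a quick follow-up, the corresponding expectation value can be confirmed by evaluating cudaq.observe at that point:

# Confirm the COBYLA minimum by re-evaluating <H> at the optimized parameters
print(cudaq.observe(kernel, hamiltonian, result.x).expectation())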
Parallel Parameter Shift Gradients¶
CUDA-Q’s mqpu backend allows for parallel computation of parameter shift gradients using multiple simulated QPUs. Gradients computed this way can be used in any of the previously discussed optimization procedures. Below is an example demonstrating how parallel gradient evaluation can be used for a VQE procedure.
The parameter shift procedure computes two expectation values for each parameter: one with the parameter shifted forward and one with it shifted backward. These are used to estimate the gradient contribution for that parameter.
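For a shift $\epsilon$ applied to the $i$th parameter, the estimate used below is the central difference

$$\frac{\partial \langle H \rangle}{\partial \theta_i} \approx \frac{\langle H \rangle(\vec{\theta} + \epsilon\,\hat{e}_i) - \langle H \rangle(\vec{\theta} - \epsilon\,\hat{e}_i)}{2\epsilon},$$

where $\hat{e}_i$ is the unit vector along the $i$th parameter. This matches the (g_plus - g_minus) / (2 * epsilon) expression assembled in the code that follows.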
The following code defines a function that takes a kernel, a Hamiltonian (spin operator), and the circuit parameters and produces a parameter shift gradient with shift epsilon. The first step of the function builds xplus and xminus, arrays consisting of the shifted parameters.
Next, a for loop iterates over all of the parameters and uses cudaq.observe_async to compute the expectation value. This command also takes qpu_id as an input, which specifies the GPU that will be used to simulate the ith QPU. In the example below, num_qpus sets how many GPUs (simulated QPUs) are available; with four GPUs, the gradient evaluations would be batched over four devices.
The results are saved in the g_plus and g_minus lists, whose elements are accessed with calls like g_plus[i].get().expectation() to compute the finite differences and construct the final gradient.
[11]:
import numpy as np

# Uncomment to enable the multi-QPU backend on a multi-GPU system
# cudaq.set_target('nvidia', option='mqpu')
num_qpus = 1
epsilon = np.pi / 4


def batched_gradient_function(kernel, parameters, hamiltonian, epsilon):
    # Prepare an array of parameters corresponding to the plus and minus shifts
    x = np.tile(parameters, (len(parameters), 1))
    xplus = x + (np.eye(x.shape[0]) * epsilon)
    xminus = x - (np.eye(x.shape[0]) * epsilon)

    g_plus = []
    g_minus = []
    gradient = []

    qpu_counter = 0  # Iterate over the number of GPU resources available
    for i in range(x.shape[0]):
        g_plus.append(
            cudaq.observe_async(kernel, hamiltonian, xplus[i],
                                qpu_id=qpu_counter % num_qpus))
        qpu_counter += 1

        g_minus.append(
            cudaq.observe_async(kernel, hamiltonian, xminus[i],
                                qpu_id=qpu_counter % num_qpus))
        qpu_counter += 1

    # Use the expectation values to compute the gradient
    gradient = [(g_plus[i].get().expectation() - g_minus[i].get().expectation()) /
                (2 * epsilon) for i in range(len(g_minus))]

    return gradient
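On a machine with multiple GPUs, the number of simulated QPUs does not have to be hard-coded. A minimal sketch, assuming the nvidia target with the mqpu option and at least one available GPU:

# Sketch: enable the multi-QPU backend and query how many simulated QPUs it exposes
cudaq.set_target('nvidia', option='mqpu')
num_qpus = cudaq.get_target().num_qpus()
print(f"Number of simulated QPUs: {num_qpus}")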
This function can be used in a VQE procedure as presented below. First, the gradient is computed using the initial parameters; then the standard VQE construction is used, but batched_gradient_function is used to evaluate the gradient at each step. This objective function returns the cost and gradient at each step and can be used with any SciPy optimizer that accepts a gradient.
[12]:
gradient = batched_gradient_function(kernel, initial_params, hamiltonian, epsilon)


def objective_function(parameter_vector,
                       hamiltonian=hamiltonian,
                       gradient=gradient,
                       kernel=kernel):
    cost = cudaq.observe(kernel, hamiltonian, parameter_vector).expectation()
    # Evaluate the parameter shift gradient at the current parameters
    gradient_vector = batched_gradient_function(kernel, parameter_vector,
                                                hamiltonian, epsilon)
    return cost, gradient_vector
[13]:
result_vqe = minimize(objective_function, initial_params, method='L-BFGS-B',
                      jac=True, tol=1e-8, options={'maxiter': 5})
print(result_vqe)