6. Quantum Kernels
[1] To differentiate between host and quantum device code, the CUDA-Q programming model defines the concept of a quantum kernel. A quantum kernel is any callable in C++ annotated to indicate compilation and execution on an available quantum coprocessor.
[2] All quantum kernels must be annotated to indicate they are to be compiled for, and executed on, a specified quantum coprocessor. CUDA-Q requires the __qpu__ function attribute for quantum kernel declarations. Other language bindings may opt to use other language features to enable function annotation or decoration (e.g. a @cudaq.kernel() function decorator in Python).
[3] CUDA-Q specifically differentiates between kernels invoked from host code and those invoked from within another quantum kernel. The former are denoted entry-point quantum kernels; the latter are pure-device quantum kernels. Entry-point kernels can only take classical types as input and return classical output. Pure-device kernels can take quantum types as input.
// Entry point lambda kernel
auto my_first_entry_point_kernel = [](double x) __qpu__ { ... quantum code ... };

// Entry point typed callable kernel
struct my_second_entry_point_kernel {
  void operator()(double x, std::vector<double> params) __qpu__ {
    ... quantum code ...
  }
};

// Pure device free function kernel
__qpu__ void my_first_pure_device_kernel(cudaq::qview<> qubits) {
  ... quantum code ...
}
@cudaq.kernel()
def my_first_entry_point_kernel(x : float):
    ... quantum code ...

@cudaq.kernel()
def my_second_entry_point_kernel(x : float, params : list[float]):
    ... quantum code ...

@cudaq.kernel()
def my_first_pure_device_kernel(qubits : cudaq.qview):
    ... quantum code ...
[4] Quantum kernel function bodies are programmed in a subset of the parent classical language. Kernels can be composed of the following (a short illustrative sketch follows the list):
- Quantum intrinsic operations and measurements
- In-scope kernel calls with or without any kernel modifiers (e.g. cudaq::control)
- Classical control flow constructs from the classical language (if, for, while, etc.)
- Stack variable declarations for supported types
- Arithmetic operations on integer and floating point stack variables
- Coherent conditional execution - if (boolExprFromQubitMeasurement) { x(another_qubit); }
- Syntax for common quantum programming patterns (e.g. compute-action-uncompute)
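For illustration only (this example is not part of the specification text), a minimal kernel combining several of these constructs, assuming the standard intrinsic operations h, x, and mz, might look like:

// Illustrative sketch: intrinsic operations, classical control flow,
// stack variables, and conditional execution on a measurement result.
// Assumes n >= 2.
auto constructs_example = [](int n) __qpu__ {
  cudaq::qvector q(n);
  for (int i = 0; i < n; i++)
    h(q[i]);
  // Measure the first qubit and branch on the classical result.
  auto bit = mz(q[0]);
  if (bit)
    x(q[1]);
};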
[5] CUDA-Q defines a set of allowed types that can be leveraged in quantum kernel function signatures (input and return types), in-function variable declaration or construction, and in variable capture from parent scope.
The allowed types are as follows:
- T such that std::is_arithmetic_v<T> == true
- std::vector<T> such that std::is_arithmetic_v<T> == true
- std::span<T> such that std::is_arithmetic_v<T> == true
- std::vector<V> such that type V is a valid std::vector<T> (possibly recursively)
- struct types composed of any valid CUDA-Q type
- std::tuple<Ts...> such that each type in Ts... is an allowed type
- std::pair<T,U> such that T and U are allowed types
struct MyCustomSimpleStruct {
  int i = 0;
  int j = 0;
  std::vector<double> angles;
};

// Valid CUDA-Q Types used in Kernels
auto kernel = [](int N, bool flag, float angle, std::vector<std::size_t> layers,
                 std::vector<double> parameters, std::vector<std::vector<float>> recursiveVec,
                 MyCustomSimpleStruct var) __qpu__ { ... };

__qpu__ double kernelThatReturns() {
  ...
  return M_PI_2;
}
@dataclass
class MySimpleStruct:
    i : int
    j : int

@cudaq.kernel
def kernel(N : int, flag : bool, angle : float, layers : list[int],
           parameters : list[float], recursiveList : list[list[float]],
           var : MySimpleStruct): ...

@cudaq.kernel
def kernelThatReturns() -> float:
    ...
    return np.pi / 2.0
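The allowed-type list also includes std::span, std::tuple, and std::pair, which the examples above do not exercise. A hypothetical sketch of such a signature (not taken from the specification, and assuming element access via std::get and operator[] is supported inside kernels) could look like:

// Hypothetical sketch: tuple, pair, and span arguments in an
// entry-point kernel signature.
auto aggregateArgKernel = [](std::tuple<int, double> t, std::pair<int, float> p,
                             std::span<double> angles) __qpu__ {
  cudaq::qvector q(std::get<0>(t));
  ry(std::get<1>(t), q[0]);
  rz(angles[0] + p.second, q[0]);
};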
[6] Any variable with an allowed CUDA-Q type can be allocated on the stack within CUDA-Q kernels. Variables of type std::vector<T> for any allowed type T can only be constructed with known size. Vector-like variables cannot be default constructed and later filled with type T data (i.e. no dynamic memory allocation).
// Stack variable declarations in kernels
auto kernel = []() __qpu__ {
  // Not Allowed.
  // std::vector<int> i;
  // i.push_back(1);

  // Valid variable declarations
  std::vector<int> i(5);
  i[2] = 3;
  std::vector<float> f {1.0, 2.0, 3.0};
  int k = 0;
  double pi = 3.1415926;
};
@cudaq.kernel
def kernel():
    # Not Allowed
    # i = []
    # i.append(1)

    i = [0 for k in range(5)]
    i[2] = 3
    f = [1., 2., 3.]
    k = 0
    pi = 3.1415926
[7] All entry-point kernel arguments adhere to pass-by-value semantics.
auto kernel = [](int i, std::vector<double> v) __qpu__ {
  // i == 2, allocate 2 qubits
  cudaq::qvector q(i);
  // v[1] == 2.0, angle here is 2.0
  ry(v[1], q[0]);
  // Change the variables, caller does not see this
  i = 5;
  v[0] = 3.0;
};

int k = 2;
std::vector<double> d {1.0, 2.0};
kernel(k, d);
// k is still 2, pass by value
// d is still {1.0, 2.0}, pass by value
@cudaq.kernel
def kernel(i : int, v : list[float]):
    # i == 2, allocate 2 qubits
    q = cudaq.qvector(i)
    # v[1] == 2.0, angle here is 2.0
    ry(v[1], q[0])
    # Change the variables, caller does not see this
    i = 5
    v[0] = 3.0

k, d = 2, [1., 2.]
kernel(k, d)
# k is still 2, pass by value
# d is still [1., 2.], pass by value
[8] CUDA-Q kernel lambdas in C++ can capture variables of allowed type by value. CUDA-Q kernels defined as custom callable types can define non-reference type class members of any allowed type. These member variables can be set with pass-by-value semantics at kernel construction.
struct kernel {
  int i;
  float f;

  void operator()() __qpu__ {
    cudaq::qvector q(i);
    ry(f, q[0]);
  }
};
kernel{2, 2.2}();

int i = 2;
double f = 2.2;
auto kernelLambda = [=]() mutable __qpu__ {
  // Use captured variables
  cudaq::qvector q(i);
  ry(f, q[0]);
  i = 5;
  return i;
};
auto k = kernelLambda();
// Pass by value semantics
assert(k != i);
i = 2
f = np.pi / 2.

@cudaq.kernel
def kernel() -> int:
    # Use captured variables
    q = cudaq.qvector(i)
    ry(f, q[0])
    i = 5
    return i

k = kernel()
# Pass by value semantics
assert k != i
[9] All quantum kernel invocations are synchronous calls by default.
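As an illustrative sketch (not part of the specification text), the direct host call below does not return until the kernel has finished executing on the target; asynchronous launches, where available, are exposed through separate library functions rather than through direct invocation.

// Sketch: direct kernel invocation from host code blocks until execution
// on the quantum coprocessor (or simulator) completes.
auto syncKernel = [](int n) __qpu__ {
  cudaq::qvector q(n);
  h(q);
  mz(q);
};

void hostCaller() {
  syncKernel(2);  // returns only after the kernel has run
  // ... subsequent host code observes a completed execution ...
}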
[10] CUDA-Q kernels can serve as input to other quantum kernels and be invoked by kernel function body code. To support parameterization of CUDA-Q kernels on callable quantum kernel code, programmers can leverage standard C++ template definitions or dynamic typing in language bindings such as Python:
__qpu__ void MyStatePrep(cudaq::qview<> qubits) {
  ... apply state prep operations on qubits ...
}

struct MyGenericAlgorithm {
  template<typename StatePrep>
  void operator()(const StatePrep& statePrep) __qpu__ {
    cudaq::qarray<10> q;
    statePrep(q);
    ...
  }
};

// -or- with placeholder type specifiers
struct MyGenericAlgorithm2 {
  void operator()(const auto& statePrep) __qpu__ {
    cudaq::qarray<10> q;
    statePrep(q);
    ...
  }
};

void callKernels() {
  MyGenericAlgorithm algorithm;
  algorithm(MyStatePrep);

  MyGenericAlgorithm2 anotherVersion;
  anotherVersion(MyStatePrep);
}
@cudaq.kernel()
def MyStatePrep(qubits : cudaq.qview):
    ... apply state prep operations on qubits ...

@cudaq.kernel()
def MyGenericAlgorithm(statePrep : typing.Callable[[cudaq.qview], None]):
    q = cudaq.qvector(10)
    statePrep(q)
    ...

MyGenericAlgorithm(MyStatePrep)
CUDA-Q kernel inputs can also be constrained, for example with C++20 concepts:
namespace cudaq {
// Generic constraint on Kernel Function Signatures
template <typename Kernel, typename Signature>
concept signature = std::is_convertible_v<Kernel, std::function<Signature>>;

// Specialized for taking a single qubit
template<typename Kernel>
concept takes_qubit = signature<Kernel, void(qubit&)>;
}

struct MyGenericAlgorithmOnQarray {
  void operator()(cudaq::signature<void(cudaq::qarray&)> auto&& statePrep) __qpu__ {
    cudaq::qarray<10> q;
    statePrep(q);
    ...
  }
};

struct MyGenericAlgorithmOnQubit {
  void operator()(cudaq::takes_qubit auto&& statePrep) __qpu__ {
    cudaq::qarray<10> q;
    statePrep(q[0]);
    ...
  }
};
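For illustration only (the kernel and function names below are hypothetical, not from the specification), a state preparation kernel acting on a single qubit satisfies the takes_qubit constraint and can be passed to the constrained callable directly:

// Sketch: a single-qubit state-prep kernel satisfying cudaq::takes_qubit.
__qpu__ void PrepareOne(cudaq::qubit& q) { ry(0.59, q); }

void useConstrainedKernels() {
  MyGenericAlgorithmOnQubit algorithm;
  algorithm(PrepareOne);  // accepted: convertible to void(cudaq::qubit&)
}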
This approach enables the development of generic libraries of quantum algorithms that are parameterized on sub-units of the global circuit representation.