CUDA-Q QEC - Quantum Error Correction Library
Overview
The cudaq-qec library provides a comprehensive framework for quantum
error correction research and development. It leverages GPU acceleration
for efficient syndrome decoding and error correction simulations (coming soon).
The library supports both offline analysis and real-time error correction on quantum hardware, enabling low-latency decoding for practical quantum computing applications.
Core Components
cudaq-qec is composed of three main interfaces:
QEC Codes (cudaq::qec::code) - Define quantum error correcting codes with logical operations
Decoders (cudaq::qec::decoder) - Implement syndrome decoding algorithms
Real-Time Decoding (cudaq::qec::decoding) - Enable online error correction on quantum hardware
These types are meant to be extended by developers to provide new error correcting codes and decoding strategies.
QEC Code Framework cudaq::qec::code
The cudaq::qec::code class serves as the base class for all quantum error correcting codes in CUDA-Q QEC. It provides
a flexible extension point for implementing new codes and defines the core interface that all QEC codes must support.
The core abstraction here is that of a mapping or dictionary of logical operations to their corresponding physical implementation in the error correcting code as CUDA-Q quantum kernels.
Class Structure
The code base class provides:
Operation Enumeration: Defines supported logical operations
enum class operation {
  x,                // Logical X gate
  y,                // Logical Y gate
  z,                // Logical Z gate
  h,                // Logical Hadamard gate
  s,                // Logical S gate
  cx,               // Logical CNOT gate
  cy,               // Logical CY gate
  cz,               // Logical CZ gate
  stabilizer_round, // Stabilizer measurement round
  prep0,            // Prepare |0⟩ state
  prep1,            // Prepare |1⟩ state
  prepp,            // Prepare |+⟩ state
  prepm             // Prepare |-⟩ state
};
Patch Type: Defines the structure of a logical qubit patch
struct patch {
  cudaq::qview<> data; // View of data qubits
  cudaq::qview<> ancx; // View of X stabilizer ancilla qubits
  cudaq::qview<> ancz; // View of Z stabilizer ancilla qubits
};
The patch type represents a logical qubit in quantum error correction codes. It contains:
data: A view of the data qubits in the patch
ancx: A view of the ancilla qubits used for X stabilizer measurements
ancz: A view of the ancilla qubits used for Z stabilizer measurements
This structure is designed for use within CUDA-Q kernel code and provides a convenient way to access different qubit subsets within a logical qubit patch.
Kernel Type Aliases: Defines quantum kernel signatures
using one_qubit_encoding = cudaq::qkernel<void(patch)>;
using two_qubit_encoding = cudaq::qkernel<void(patch, patch)>;
using stabilizer_round = cudaq::qkernel<std::vector<cudaq::measure_result>(
    patch, const std::vector<std::size_t>&, const std::vector<std::size_t>&)>;
Protected Members:
operation_encodings: Maps operations to their quantum kernel implementations. The key is the operation enum and the value is a variant on the above kernel type aliases.
m_stabilizers: Stores the code’s stabilizer generators
Implementing a New Code
To implement a new quantum error correcting code:
Create a New Class:
class my_code : public qec::code {
protected:
  // Implement required virtual methods

public:
  my_code(const heterogeneous_map& options);
};
Implement Required Virtual Methods:
// Number of physical data qubits
std::size_t get_num_data_qubits() const override;

// Total number of ancilla qubits
std::size_t get_num_ancilla_qubits() const override;

// Number of X-type ancilla qubits
std::size_t get_num_ancilla_x_qubits() const override;

// Number of Z-type ancilla qubits
std::size_t get_num_ancilla_z_qubits() const override;
Define Quantum Kernels:
Create CUDA-Q kernels for each logical operation:
__qpu__ void x(patch p) {
  // Implement logical X
}

__qpu__ std::vector<cudaq::measure_result> stabilizer(
    patch p,
    const std::vector<std::size_t>& x_stabs,
    const std::vector<std::size_t>& z_stabs) {
  // Implement stabilizer measurements
}
Register Operations:
In the constructor, register quantum kernels for each operation:
my_code::my_code(const heterogeneous_map& options) : code() {
  // Register operations
  operation_encodings.insert(
      std::make_pair(operation::x, x));
  operation_encodings.insert(
      std::make_pair(operation::stabilizer_round, stabilizer));

  // Define stabilizer generators
  m_stabilizers = qec::stabilizers({"XXXX", "ZZZZ"});
}
Note that in your constructor, you have access to user-provided options. For example, if your code depends on an integer parameter called distance, you can retrieve that from the user via
my_code::my_code(const heterogeneous_map& options) : code() {
  // ... fill the map and stabilizers ...

  // Get the user-provided distance, or just
  // set to 3 if user did not provide one
  this->distance = options.get<int>("distance", /*defaultValue*/ 3);
}
Register Extension Point:
Add extension point registration:
CUDAQ_EXTENSION_CUSTOM_CREATOR_FUNCTION(
    my_code,
    static std::unique_ptr<qec::code> create(
        const heterogeneous_map &options) {
      return std::make_unique<my_code>(options);
    }
)

CUDAQ_REGISTER_TYPE(my_code)
Example: Steane Code
The Steane [[7,1,3]] code provides a complete example implementation:
Header Definition:
Declares quantum kernels for all logical operations
Defines the code class with required virtual methods
Specifies 7 data qubits and 6 ancilla qubits (3 X-type, 3 Z-type)
Implementation:
steane::steane(const heterogeneous_map &options) : code() {
  // Register all logical operations
  operation_encodings.insert(
      std::make_pair(operation::x, x));
  // ... register other operations ...

  // Define stabilizer generators
  m_stabilizers = qec::stabilizers({
      "XXXXIII", "IXXIXXI", "IIXXIXX",
      "ZZZZIII", "IZZIZZI", "IIZZIZZ"
  });
}
Quantum Kernels:
Implements fault-tolerant logical operations:
__qpu__ void x(patch logicalQubit) {
  // Apply logical X to specific data qubits
  x(logicalQubit.data[4], logicalQubit.data[5], logicalQubit.data[6]);
}

__qpu__ std::vector<cudaq::measure_result> stabilizer(
    patch logicalQubit,
    const std::vector<std::size_t>& x_stabilizers,
    const std::vector<std::size_t>& z_stabilizers) {
  // Measure X stabilizers
  h(logicalQubit.ancx);
  // ... apply controlled-X gates ...
  h(logicalQubit.ancx);

  // Measure Z stabilizers
  // ... apply controlled-X gates ...

  // Return measurement results
  return mz(logicalQubit.ancz, logicalQubit.ancx);
}
Implementing a New Code in Python
CUDA-Q QEC supports implementing quantum error correction codes in Python
using the @qec.code decorator. This provides a more accessible way
to prototype and develop new codes.
Create a New Python File:
Create a new file (e.g., my_steane.py) with your code implementation:
import cudaq
import cudaq_qec as qec
from cudaq_qec import patch
Define Quantum Kernels:
Implement the required quantum kernels using the @cudaq.kernel decorator:
@cudaq.kernel
def prep0(logicalQubit: patch):
    h(logicalQubit.data[0], logicalQubit.data[4], logicalQubit.data[6])
    x.ctrl(logicalQubit.data[0], logicalQubit.data[1])
    x.ctrl(logicalQubit.data[4], logicalQubit.data[5])
    # ... additional initialization gates ...

@cudaq.kernel
def stabilizer(logicalQubit: patch, x_stabilizers: list[int],
               z_stabilizers: list[int]) -> list[bool]:
    # Measure X stabilizers
    h(logicalQubit.ancx)
    for xi in range(len(logicalQubit.ancx)):
        for di in range(len(logicalQubit.data)):
            if x_stabilizers[xi * len(logicalQubit.data) + di] == 1:
                x.ctrl(logicalQubit.ancx[xi], logicalQubit.data[di])
    h(logicalQubit.ancx)

    # Measure Z stabilizers
    for zi in range(len(logicalQubit.ancz)):
        for di in range(len(logicalQubit.data)):
            if z_stabilizers[zi * len(logicalQubit.data) + di] == 1:
                x.ctrl(logicalQubit.data[di], logicalQubit.ancz[zi])

    # Get and reset ancillas
    results = mz([*logicalQubit.ancz, *logicalQubit.ancx])
    reset(logicalQubit.ancx)
    reset(logicalQubit.ancz)
    return results
Implement the Code Class:
Create a class decorated with @qec.code that implements the required interface:
@qec.code('py-steane-example')
class MySteaneCodeImpl:

    def __init__(self, **kwargs):
        qec.Code.__init__(self, **kwargs)

        # Define stabilizer generators
        stabilizers_str = [
            "XXXXIII", "IXXIXXI", "IIXXIXX",
            "ZZZZIII", "IZZIZZI", "IIZZIZZ"
        ]
        self.stabilizers = [
            cudaq.SpinOperator.from_word(s) for s in stabilizers_str
        ]

        # Define observables
        obs_str = ["IIIIXXX", "IIIIZZZ"]
        self.pauli_observables = [
            cudaq.SpinOperator.from_word(p) for p in obs_str
        ]

        # Register quantum kernels
        self.operation_encodings = {
            qec.operation.prep0: prep0,
            qec.operation.stabilizer_round: stabilizer
        }

    def get_num_data_qubits(self):
        return 7

    def get_num_ancilla_x_qubits(self):
        return 3

    def get_num_ancilla_z_qubits(self):
        return 3

    def get_num_ancilla_qubits(self):
        return 6

    def get_num_x_stabilizers(self):
        return 3

    def get_num_z_stabilizers(self):
        return 3
Using the Code:
The code can now be used like any other CUDA-Q QEC code:
import cudaq_qec as qec

# Either import your code directly (e.g. "import my_steane", assuming
# your my_steane.py file is in the same directory), or you can paste the
# above code here without importing the file.
import my_steane

# Create instance of your code
code = qec.get_code('py-steane-example')

# Use the code for various numerical experiments
Key Points
The @qec.code decorator takes the name of the code as an argument
Operation encodings are registered via the operation_encodings dictionary
Stabilizer generators are defined using the qec.Stabilizers class
The code must implement all required methods from the base class interface
Using the Code Framework
To use an implemented code:
# Create a code instance
code = qec.get_code("steane")
# Access stabilizer information
stabilizers = code.get_stabilizers()
parity = code.get_parity()
# The code can now be used for various numerical
# experiments - see section below.
// Create a code instance
auto code = cudaq::qec::get_code("steane");
// Access stabilizer information
auto stabilizers = code->get_stabilizers();
auto parity = code->get_parity();
// The code can now be used for various numerical
// experiments - see section below.
Pre-built QEC Codes
CUDA-Q QEC provides several well-studied quantum error correction codes out of the box. Here’s a detailed overview of each:
Steane Code
The Steane code is a [[7,1,3]] CSS (Calderbank-Shor-Steane) code that encodes
one logical qubit into seven physical qubits with a code distance of 3.
Key Properties:
Data qubits: 7
Encoded qubits: 1
Code distance: 3
Ancilla qubits: 6 (3 for X stabilizers, 3 for Z stabilizers)
Stabilizer Generators:
X-type: ["XXXXIII", "IXXIXXI", "IIXXIXX"]
Z-type: ["ZZZZIII", "IZZIZZI", "IIZZIZZ"]
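The Steane code can correct any single-qubit error and detect up to two errors. It is particularly notable as a small CSS code on which the full Clifford group can be implemented transversally. (By the Eastin-Knill theorem, no quantum error correcting code admits a universal set of transversal gates.)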
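The single-error claim can be checked directly from the generators listed above. The following is a minimal pure-Python sketch that does not use cudaq_qec; the helper name single_error_syndromes is hypothetical. It confirms that every single-qubit error of one type produces a distinct, nonzero syndrome, which is what makes a syndrome-to-qubit lookup unambiguous.

```python
# Supports of the Steane generators "XXXXIII", "IXXIXXI", "IIXXIXX".
# The Z-type generators act on the same qubits.
GENERATORS = ["XXXXIII", "IXXIXXI", "IIXXIXX"]
H = [[0 if c == "I" else 1 for c in g] for g in GENERATORS]


def single_error_syndromes(H):
    """Return the syndrome (tuple of bits) of an error on each single qubit.

    An error on qubit q flips check r exactly when H[r][q] == 1, so the
    syndrome of that error is column q of H.
    """
    n = len(H[0])
    return [tuple(row[q] for row in H) for q in range(n)]


syndromes = single_error_syndromes(H)
for qubit, s in enumerate(syndromes):
    print(qubit, s)

# Every syndrome is nonzero and no two qubits share a syndrome
all_nonzero = all(any(s) for s in syndromes)
all_distinct = len(set(syndromes)) == len(syndromes)
```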
Usage:
import cudaq_qec as qec
# Create Steane code instance
steane = qec.get_code("steane")
auto steane = cudaq::qec::get_code("steane");
Repetition Code
The repetition code is a simple [[n,1,n]] code that protects against bit-flip (X) errors by encoding one logical qubit into n physical qubits, where n is the code distance.
Key Properties:
Data qubits: n (distance)
Encoded qubits: 1
Code distance: n
Ancilla qubits: n-1 (all for Z stabilizers)
Stabilizer Generators:
For distance 3: ["ZZI", "IZZ"]
For distance 5: ["ZZIII", "IZZII", "IIZZI", "IIIZZ"]
The repetition code is primarily educational as it can only correct X errors. However, it serves as an excellent introduction to QEC concepts.
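The pattern in the generators above (a ZZ pair sliding along the chain) can be written down for any distance. This is an illustrative sketch only; repetition_stabilizers is a hypothetical helper, not part of cudaq_qec.

```python
def repetition_stabilizers(distance: int) -> list[str]:
    """Z-type generators Z_i Z_{i+1} of a repetition code, qubit 0 leftmost."""
    if distance < 2:
        raise ValueError("distance must be at least 2")
    words = []
    for i in range(distance - 1):
        word = ["I"] * distance
        word[i] = "Z"
        word[i + 1] = "Z"
        words.append("".join(word))
    return words


print(repetition_stabilizers(3))  # ['ZZI', 'IZZ']
print(repetition_stabilizers(5))  # ['ZZIII', 'IZZII', 'IIZZI', 'IIIZZ']
```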
Usage:
import cudaq_qec as qec
# Create distance-3 repetition code
code = qec.get_code('repetition', distance=3)
# Access stabilizers
stabilizers = code.get_stabilizers() # Returns ["ZZI", "IZZ"]
auto code = qec::get_code("repetition", {{"distance", 3}});
// Access stabilizers
auto stabilizers = code->get_stabilizers();
Decoder Framework cudaq::qec::decoder
The CUDA-Q QEC decoder framework provides an extensible system for implementing
quantum error correction decoders through the cudaq::qec::decoder base class.
Class Structure
The decoder base class defines the core interface for syndrome decoding:
class decoder {
protected:
std::size_t block_size; // For [n,k] code, this is n
std::size_t syndrome_size; // For [n,k] code, this is n-k
tensor<uint8_t> H; // Parity check matrix
public:
struct decoder_result {
bool converged; // Decoder convergence status
std::vector<float_t> result; // Soft error probabilities
};
virtual decoder_result decode(
const std::vector<float_t>& syndrome) = 0;
virtual std::vector<decoder_result> decode_batch(
const std::vector<std::vector<float_t>>& syndrome);
};
Key Components:
Parity Check Matrix: Defines the code structure via H
Block Size: Number of physical qubits in the code
Syndrome Size: Number of stabilizer measurements
Decoder Result: Contains convergence status and error probabilities
Multiple Decoding Modes: Single syndrome or batch processing
Implementing a New Decoder in C++
To implement a new decoder:
Create Decoder Class:
class my_decoder : public qec::decoder {
private:
// Decoder-specific members
public:
my_decoder(const tensor<uint8_t>& H,
const heterogeneous_map& params)
: decoder(H) {
// Initialize decoder
}
decoder_result decode(
const std::vector<float_t>& syndrome) override {
// Implement decoding logic
}
};
Register Extension Point:
CUDAQ_EXTENSION_CUSTOM_CREATOR_FUNCTION(
my_decoder,
static std::unique_ptr<decoder> create(
const tensor<uint8_t>& H,
const heterogeneous_map& params) {
return std::make_unique<my_decoder>(H, params);
}
)
CUDAQ_REGISTER_TYPE(my_decoder)
Example: Lookup Table Decoder
Here’s a simple lookup table decoder for the Steane code:
class single_error_lut : public decoder {
private:
std::map<std::string, std::size_t> single_qubit_err_signatures;
public:
single_error_lut(const tensor<uint8_t>& H,
const heterogeneous_map& params)
: decoder(H) {
// Build lookup table for single-qubit errors
for (std::size_t qErr = 0; qErr < block_size; qErr++) {
std::string err_sig(syndrome_size, '0');
for (std::size_t r = 0; r < syndrome_size; r++) {
bool syndrome = 0;
// An error on qubit qErr flips check r exactly when H(r, qErr) is 1
for (std::size_t c = 0; c < block_size; c++)
syndrome ^= (c == qErr) && H.at({r, c});
err_sig[r] = syndrome ? '1' : '0';
}
single_qubit_err_signatures.insert({err_sig, qErr});
}
}
decoder_result decode(
const std::vector<float_t>& syndrome) override {
decoder_result result{false,
std::vector<float_t>(block_size, 0.0)};
// Convert syndrome to string
std::string syndrome_str(syndrome_size, '0');
for (std::size_t i = 0; i < syndrome_size; i++)
syndrome_str[i] = (syndrome[i] >= 0.5) ? '1' : '0';
// Lookup error location
auto it = single_qubit_err_signatures.find(syndrome_str);
if (it != single_qubit_err_signatures.end()) {
result.converged = true;
result.result[it->second] = 1.0;
}
return result;
}
};
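The same idea can be sketched in plain Python without the library. This is not the cudaq_qec implementation: the function names are hypothetical, and the table stores the syndrome of each single-qubit error as the corresponding column of H.

```python
def build_single_error_table(H):
    """Map each single-qubit error signature (bit string) to its qubit index."""
    syndrome_size, block_size = len(H), len(H[0])
    table = {}
    for q in range(block_size):
        signature = "".join(str(H[r][q]) for r in range(syndrome_size))
        table[signature] = q
    return table


def decode(table, block_size, syndrome):
    """Return (converged, result); result marks the predicted error location."""
    # Threshold soft syndrome values at 0.5 to obtain the lookup key
    key = "".join("1" if s >= 0.5 else "0" for s in syndrome)
    result = [0.0] * block_size
    if key in table:
        result[table[key]] = 1.0
        return True, result
    return False, result


# Supports of the Steane generators XXXXIII, IXXIXXI, IIXXIXX
H = [[1, 1, 1, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 1, 0],
     [0, 0, 1, 1, 0, 1, 1]]
table = build_single_error_table(H)
converged, result = decode(table, 7, [0.0, 1.0, 1.0])
print(converged, result.index(1.0))  # True 5
```

A dictionary keyed by the syndrome string keeps decoding to a single lookup, at the cost of only recognizing the error patterns that were enumerated when the table was built.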
Implementing a Decoder in Python
CUDA-Q QEC supports implementing decoders in Python using the @qec.decoder decorator:
Create Decoder Class:
import cudaq_qec as qec

@qec.decoder("my_decoder")
class MyDecoder:

    def __init__(self, H, **kwargs):
        qec.Decoder.__init__(self, H)
        self.H = H
        # Initialize with optional kwargs

    def decode(self, syndrome):
        # Create result object
        result = qec.DecoderResult()

        # Implement decoding logic
        # ...

        # Set results
        result.converged = True
        result.result = [0.0] * self.block_size
        return result
Using Custom Parameters:
# Create decoder with custom parameters
decoder = qec.get_decoder("my_decoder",
H=parity_check_matrix,
custom_param=42)
Key Features
Soft Decision Decoding: Results are probabilities in [0,1]
Batch Processing: Support for decoding multiple syndromes
Asynchronous Decoding: Optional async interface for parallel processing
Custom Parameters: Flexible configuration via heterogeneous_map
Python Integration: First-class support for Python implementations
Usage Example
import cudaq_qec as qec
# Get a code instance
steane = qec.get_code("steane")
# Create decoder with code's parity matrix
decoder = qec.get_decoder('single_error_lut', steane.get_parity())
# Run stabilizer measurements
syndromes, dataQubitResults = qec.sample_memory_circuit(steane, numShots=1, numRounds=1)
# Decode a syndrome
result = decoder.decode(syndromes[0])
if result.converged:
    print("Error locations:",
          [i for i, p in enumerate(result.result) if p > 0.5])
    # No errors as we did not include a noise model and
    # thus prints:
    # Error locations: []
using namespace cudaq;
// Get a code instance
auto code = qec::get_code("steane");
// Create decoder with code's parity matrix
auto decoder = qec::get_decoder("single_error_lut",
code->get_parity());
// Run stabilizer measurements
auto [syndromes, dataQubitResults] = qec::sample_memory_circuit(*code, /*numShots*/ 1, /*numRounds*/ 1);
// Decode syndrome
auto result = decoder->decode(syndromes[0]);
Pre-built QEC Decoders
CUDA-Q QEC provides pre-built decoders for a variety of use cases.
| Decoder | Decoder String Identifier | Python | C++ | Real-Time Enabled | Notes |
|---|---|---|---|---|---|
| NVIDIA QLDPC Decoder¹ | nv-qldpc-decoder | Yes | Yes | Yes | Supports Relay BP and BP+OSD |
| Tensor Network Decoder¹ | tensor_network_decoder | Yes² | No | No | Exact Maximum Likelihood Decoder |
| TensorRT Decoder¹ | | Yes³ | Yes | Not yet | AI decoder. Bring your own model. |
| Look-Up Table Decoder | single_error_lut | Yes | Yes | Yes | Simple decoder with no configurable options |
| Look-Up Table Decoder | | Yes | Yes | Yes | Multi-error decoder that can handle up to “lut_error_depth” errors |
| Sliding Window Decoder | | Yes | Yes | Not yet | Decodes syndromes in a sliding window fashion. May be paired with any other decoder as an inner decoder except the TensorRT Decoder |
² pip install cudaq-qec[tensor-network-decoder] for Python
³ pip install cudaq-qec[trt-decoder] for Python
Here’s a detailed overview of each:
Quantum Low-Density Parity-Check Decoder
The Quantum Low-Density Parity-Check (QLDPC) decoder leverages GPU-accelerated belief propagation (BP) for efficient error correction.
Since belief propagation is an iterative method which may not converge, decoding can be improved with a second-stage post-processing step. The nv-qldpc-decoder
API provides various post-processing options, which can be selected through its parameters.
Belief Propagation Methods:
The decoder supports multiple BP algorithms (configured via bp_method):
Sum-Product BP (bp_method=0, default): Classic belief propagation algorithm that computes exact probabilities.
Min-Sum BP (bp_method=1): Approximation to sum-product that uses min operations instead of sum. Optionally accepts scale_factor.
Memory-based BP (bp_method=2): Min-sum with uniform memory strength across all variable nodes. Requires: gamma0.
Disordered Memory BP (bp_method=3): Min-sum with per-variable memory strengths. Requires: gamma_dist [min, max] OR explicit_gammas (2D vector).
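To make the difference between the first two methods concrete, here is a sketch of the textbook check-node update rules for sum-product and min-sum BP on log-likelihood ratios. It is not the nv-qldpc-decoder implementation, the function names are hypothetical, and whether the library's scale_factor enters exactly as the multiplier shown here is an assumption.

```python
import math


def sum_product_check(msgs):
    """Check-node output LLR from incoming LLRs using the tanh rule."""
    prod = 1.0
    for m in msgs:
        prod *= math.tanh(m / 2.0)
    return 2.0 * math.atanh(prod)


def min_sum_check(msgs, scale_factor=1.0):
    """Min-sum approximation: product of signs times the smallest magnitude."""
    sign = 1.0
    for m in msgs:
        if m < 0:
            sign = -sign
    return scale_factor * sign * min(abs(m) for m in msgs)


incoming = [1.2, -0.7, 2.5]
exact = sum_product_check(incoming)
approx = min_sum_check(incoming)
print(exact, approx)
```

Both rules agree on the sign of the outgoing message, but min-sum never reports a smaller magnitude than sum-product. Scaling the min-sum output by a factor below 1 is a common way to compensate for that overestimate.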
Sequential Relay Decoding:
Starting with version 0.5.0, the decoder supports Sequential Relay BP (configured via composition=1), which combines disordered memory BP
with multiple “relay legs” - sequential runs with different gamma configurations. Requires: bp_method=3, gamma0, srelay_config, and either gamma_dist OR explicit_gammas.
The QLDPC decoder nv-qldpc-decoder requires a CUDA-Q compatible GPU. See the list below for dependencies and compatibility:
https://nvidia.github.io/cuda-quantum/latest/using/install/local_installation.html#dependencies-and-compatibility
The decoder is based on the following references:
Usage:
import cudaq_qec as qec
import numpy as np
H_list = [
[1, 0, 0, 1, 0, 1, 1],
[0, 1, 0, 1, 1, 0, 1],
[0, 0, 1, 0, 1, 1, 1]
]
H_np = np.array(H_list, dtype=np.uint8)
decoder = qec.get_decoder("nv-qldpc-decoder", H_np)
std::size_t block_size = 7;
std::size_t syndrome_size = 3;
cudaqx::tensor<uint8_t> H;
std::vector<uint8_t> H_vec = {1, 0, 0, 1, 0, 1, 1,
0, 1, 0, 1, 1, 0, 1,
0, 0, 1, 0, 1, 1, 1};
H.copy(H_vec.data(), {syndrome_size, block_size});
cudaqx::heterogeneous_map nv_custom_args;
nv_custom_args.insert("use_osd", true);
auto d1 = cudaq::qec::get_decoder("nv-qldpc-decoder", H, nv_custom_args);
// Alternatively, configure the decoder without instantiating a heterogeneous_map
auto d2 = cudaq::qec::get_decoder("nv-qldpc-decoder", H, {{"use_osd", true}, {"bp_batch_size", 100}});
Tensor Network Decoder
The tensor_network_decoder constructs a tensor network representation of a quantum code given its parity check matrix, logical observable(s), and noise model. It can decode individual syndromes or batches of syndromes, returning the probability that a logical observable has flipped.
Due to the additional dependencies of the Tensor Network Decoder, you must
specify the optional pip package when installing CUDA-Q QEC in order to use this
decoder. Use pip install cudaq-qec[tensor-network-decoder] in order to use
this decoder.
Key Steps:
Define the parity check matrix: This matrix encodes the structure of the quantum code. In the example, a simple [3,1] repetition code is used.
Specify the logical observable: This is typically a row vector indicating which qubits participate in the logical operator.
Set the noise model: The example uses a factorized noise model with independent bit-flip probability for each error mechanism.
Instantiate the decoder: Create a decoder object using qec.get_decoder("tensor_network_decoder", ...) with the code parameters.
Decode syndromes: Use the decode method for single syndromes or decode_batch for multiple syndromes.
Usage:
# This example demonstrates how to use the get_decoder("tensor_network_decoder", ...) API
# from the ``cudaq_qec`` library to decode syndromes for a simple
# quantum error-correcting code using tensor networks.
import cudaq_qec as qec
import numpy as np
# Define code parameters
H = np.array([[1, 1, 0], [0, 1, 1]], dtype=np.uint8)
logical_obs = np.array([[1, 1, 1]], dtype=np.uint8)
noise_model = [0.1, 0.1, 0.1]
decoder = qec.get_decoder("tensor_network_decoder", H, logical_obs=logical_obs, noise_model=noise_model)
# Decode a single syndrome
syndrome = [0.0, 1.0]
result = decoder.decode(syndrome)
print(result.result)
# Decode a batch of syndromes
syndrome_batch = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]], dtype=np.float32)
batch_results = decoder.decode_batch(syndrome_batch)
for res in batch_results:
    print(res.result)
The tensor_network_decoder is a Python-only implementation and it requires Python 3.11 or higher. C++ APIs are not available for this decoder.
Output:
The decoder returns the probability that the logical observable has flipped for each syndrome. This can be used to assess the performance of the code and the decoder under different error scenarios.
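For a code this small, the quantity described above can be computed by brute force, which helps when sanity-checking results. The sketch below enumerates every error pattern under an independent bit-flip model and conditions on the observed syndrome. It does not use tensor networks or cudaq_qec, the function name is hypothetical, and it is only meant to illustrate what "probability that the logical observable has flipped" means for the example parameters used above.

```python
from itertools import product


def logical_flip_probability(H, logical_obs, noise_model, syndrome):
    """P(logical observable flipped | syndrome), summing over all error patterns."""
    total = 0.0
    flipped = 0.0
    for error in product([0, 1], repeat=len(noise_model)):
        # Keep only error patterns whose syndrome matches the observed one
        s = [sum(h * e for h, e in zip(row, error)) % 2 for row in H]
        if s != list(syndrome):
            continue
        # Probability of this pattern under independent bit flips
        prob = 1.0
        for e, p in zip(error, noise_model):
            prob *= p if e else 1.0 - p
        total += prob
        if sum(o * e for o, e in zip(logical_obs, error)) % 2 == 1:
            flipped += prob
    return flipped / total


H = [[1, 1, 0], [0, 1, 1]]
logical_obs = [1, 1, 1]
noise_model = [0.1, 0.1, 0.1]
p_flip = logical_flip_probability(H, logical_obs, noise_model, [0, 1])
print(p_flip)  # approximately 0.9
```

For the syndrome [0, 1], only the error patterns 001 (probability 0.081) and 110 (probability 0.009) are consistent, and only the first flips the observable, giving 0.081 / 0.09 = 0.9.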
Note
In general, the Tensor Network Decoder has the same GPU support as the
Quantum Low-Density Parity-Check Decoder.
However, if you are using the V100 GPU (SM70), you will need to pin your
cuTensor version to 2.2 by running pip install cutensor_cu12==2.2. Note
that this GPU will not be supported by the Tensor Network Decoder when
CUDA-Q 0.5.0 is released.
Real-Time Decoding
CUDA-Q QEC provides real-time decoding capabilities for quantum error correction on actual quantum hardware. Real-time decoding enables decoders to process syndromes and compute corrections within qubit coherence times, making active error correction practical for real quantum computers.
Key Features
In-Kernel Operation: Syndrome decoding within CUDA-Q kernels.
Hardware Integration: Direct integration with quantum hardware backends (Quantinuum’s Helios QPU).
Simulation Support: Test real-time workflows locally before deploying to hardware.
Multiple Decoder Types: For real-time decoders, see the table Pre-built QEC Decoders.
GPU Acceleration: Leverage CUDA for high-performance syndrome decoding.
Note: The real-time decoding interfaces are experimental, and subject to change. Real-time decoding on Quantinuum’s Helios-1 device is currently only available to partners and collaborators. Please email QCSupport@quantinuum.com for more information.
Workflow and Terminology
The real-time decoding workflow involves configuring a decoder (or many) before CUDA-Q kernel launch, and communicating to the decoders with special in-kernel functions. A decoder is a single software instance of a decoding algorithm, and all its relevant inputs (parity-check matrices, error rates, etc.) which will remain static for the execution of the quantum program. A decoder config may contain many decoders, each with different algorithms and input parameters.
In a quantum kernel, a user interacts with the decoders via the enqueue_syndromes and get_corrections interfaces.
The behavior of these functions depends on their configuration and their usage.
The real-time decoding workflow can be described with respect to the offline decoding workflow.
The non-real-time decoders require a detector error model which is specified via a detector error matrix which is the parity check matrix H of the decoding problem, and a vector of weights (error rates).
This matrix has dimensions of [numDetectors, numErrors], where each row is a detector and each column is a possible error.
For real-time decoding, we first need to convert the circuit measurements into detectors.
This is specified via the detector matrix D, which has dimensions [numDetectors, numMeasurements].
Each column of the detector matrix defines which detectors a measurement participates in by including an entry of 1.
This way, once all numMeasurements measurements are enqueued, a matrix-vector multiply converts this buffer of raw measurements into detectors, which are then passed into the decoding algorithm.
Similarly, an observables flips matrix O of size [numObs, numErrors] must be provided.
Each column of the observables flips matrix describes for each error, which observables are flipped by that error by including an entry of 1.
Once the decoding algorithm has processed the detectors, it provides a vector of predicted errors of length numErrors.
This vector then executes a matrix-vector multiply with the observables flips matrix to yield a new vector of length numObs which contains an entry of 1 if the observable is predicted to have flipped.
Thus once a decoder is configured, we can view the real-time decoder as a transformation of data starting from a vector of raw measurements, then transformed into detectors via D, then error predictions via H, then observable flip predictions via O. This last step is what is returned via get_corrections. The user configures how many bits of information are returned, and what they represent via the O matrix in the decoder config.
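The chain measurements → detectors → predicted errors → observable flips can be sketched with plain mod-2 linear algebra. The example below is illustrative only: the matrices describe a toy two-round distance-3 repetition code with bit flips on data qubits and no measurement errors, the brute-force minimum-weight search stands in for a real decoding algorithm, and none of the names are part of the cudaq_qec API.

```python
from itertools import product


def matvec_mod2(M, v):
    """Binary matrix-vector product over GF(2)."""
    return [sum(a * b for a, b in zip(row, v)) % 2 for row in M]


def min_weight_decode(H, detectors):
    """Toy stand-in decoder: lowest-weight error pattern matching the detectors."""
    best = None
    for e in product([0, 1], repeat=len(H[0])):
        if matvec_mod2(H, e) == detectors and (best is None or sum(e) < sum(best)):
            best = list(e)
    return best


# Measurements are ordered [round1_check0, round1_check1, round2_check0, round2_check1]
D = [[1, 0, 0, 0],   # detector 0: round-1 value of check 0
     [0, 1, 0, 0],   # detector 1: round-1 value of check 1
     [1, 0, 1, 0],   # detector 2: change of check 0 between rounds
     [0, 1, 0, 1]]   # detector 3: change of check 1 between rounds
# Errors 0-2: bit flips on data qubits before round 1; errors 3-5: between rounds
H = [[1, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 0],
     [0, 0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1, 1]]
O = [[1, 1, 1, 1, 1, 1]]  # every bit flip toggles the single observable

measurements = [0, 0, 0, 1]                   # check 1 fires only in round 2
detectors = matvec_mod2(D, measurements)      # shape [numDetectors]
errors = min_weight_decode(H, detectors)      # shape [numErrors]
corrections = matvec_mod2(O, errors)          # shape [numObs]
print(detectors, errors, corrections)  # [0, 0, 0, 1] [0, 0, 0, 0, 0, 1] [1]
```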
Similarly, the user determines how many measurements are needed for the decoder via the D matrix in the decoder config, and they are sent to the decoder via enqueue_syndromes.
For flexibility, the user can choose to send all measurements with a single enqueue_syndromes call, or send them over several calls.
However they are split up, the decoder will not begin decoding until all numMeasurements have been enqueued, and will throw an error if too many are sent.
Thus it is the final enqueue_syndromes call which kicks off the decoder, and this call is asynchronous.
Additional quantum gates can be applied, and only when get_corrections is called does the kernel sync and wait for the corrections.
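The buffering behavior described above can be modeled with a small Python class. This is a hypothetical, synchronous model for intuition only: the class name and decode_fn are made up, it is not the in-kernel cudaq_qec interface, and where the real get_corrections waits for the decoder, this model simply raises if the buffer is still incomplete.

```python
class ToyDecoderQueue:
    """Hypothetical model of the measurement buffer in front of one decoder."""

    def __init__(self, num_measurements, decode_fn):
        self.num_measurements = num_measurements
        self.decode_fn = decode_fn
        self.buffer = []
        self.corrections = None

    def enqueue_syndromes(self, bits):
        if len(self.buffer) + len(bits) > self.num_measurements:
            raise ValueError("more measurements enqueued than the decoder expects")
        self.buffer.extend(bits)
        # Decoding is triggered only by the call that completes the buffer
        if len(self.buffer) == self.num_measurements:
            self.corrections = self.decode_fn(self.buffer)

    def get_corrections(self):
        if self.corrections is None:
            raise RuntimeError("not all measurements have been enqueued yet")
        return self.corrections


# Placeholder decode function: parity of all measurements
queue = ToyDecoderQueue(4, decode_fn=lambda bits: [sum(bits) % 2])
queue.enqueue_syndromes([0, 0])   # buffer incomplete: nothing is decoded
queue.enqueue_syndromes([0, 1])   # buffer complete: decoding runs
print(queue.get_corrections())    # [1]
```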
For detailed information on real-time decoding, see:
Real-Time Decoding - Complete Guide with Examples
CUDA-Q QEC C++ API - C++ API Reference (see Real-Time Decoding section)
CUDA-Q QEC Python API - Python API Reference (see Real-Time Decoding section)
Numerical Experiments
CUDA-Q QEC provides utilities for running numerical experiments with quantum error correction codes.
Conventions
To address vectors of qubits (cudaq::qvector), CUDA-Q indexing starts from 0, and 0 corresponds
to the leftmost position when working with Pauli strings (cudaq::spin_op). For example, applying a Pauli X operator
to qubit 1 out of 7 would be X_1 = IXIIIII.
While implementing your own codes and decoders, you are free to follow any convention that is convenient to you. However,
to interact with the pre-built QEC codes and decoders within this library, the following conventions are used. All of these codes
are CSS codes, and so we separate \(X\)-type and \(Z\)-type errors. For example, an error vector for 3 qubits will
have 6 entries, 3 bits representing the presence of a bit-flip on each qubit, and 3 bits representing a phase-flip on each qubit.
An error vector representing a bit-flip on qubit 0, and a phase-flip on qubit 1 would look like E = 100010. This means that this
error vector is just two error vectors (E_X, E_Z) concatenated together (E = E_X | E_Z).
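As a quick illustration of this convention, the following sketch converts a Pauli word into the concatenated error vector. The helper error_vector is hypothetical, and treating Y as both a bit-flip and a phase-flip is the usual convention rather than something stated above.

```python
def error_vector(pauli_word: str) -> str:
    """Return E = E_X | E_Z for a Pauli word, with qubit 0 leftmost."""
    # X and Y contain a bit-flip; Z and Y contain a phase-flip
    e_x = "".join("1" if p in "XY" else "0" for p in pauli_word)
    e_z = "".join("1" if p in "ZY" else "0" for p in pauli_word)
    return e_x + e_z


# Bit-flip on qubit 0 and phase-flip on qubit 1 of a 3-qubit register
print(error_vector("XZI"))  # 100010
```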
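These errors are detected by stabilizers. \(Z\)-stabilizers detect \(X\)-type errors and vice versa. Thus we write our CSS parity check matrices as
\[ H_{CSS} = \begin{pmatrix} H_Z & 0 \\ 0 & H_X \end{pmatrix} \]
so that when we generate a syndrome vector by multiplying the parity check matrix by an error vector we get
\[ S = H_{CSS} \cdot E, \qquad S_X = H_Z \cdot E_X, \qquad S_Z = H_X \cdot E_Z. \]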
This means that for the concatenated syndrome vector S = S_X | S_Z, the first part, S_X, are syndrome bits triggered by Z
stabilizers detecting X errors. This is because the Z stabilizers like ZZI and IZZ anti-commute with X errors like
IXI.
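The syndrome convention (S_X produced by Z stabilizers acting on E_X, S_Z produced by X stabilizers acting on E_Z) can be sketched for the Steane code as follows. The function name syndrome is hypothetical and the code uses plain Python lists rather than the library's tensor type.

```python
def syndrome(H_Z, H_X, E):
    """Return S = S_X | S_Z, with S_X = H_Z * E_X and S_Z = H_X * E_Z (mod 2)."""
    n = len(H_Z[0])
    E_X, E_Z = E[:n], E[n:]
    S_X = [sum(h * e for h, e in zip(row, E_X)) % 2 for row in H_Z]
    S_Z = [sum(h * e for h, e in zip(row, E_Z)) % 2 for row in H_X]
    return S_X + S_Z


# Steane code: the X-type and Z-type generators share the same supports
H = [[1, 1, 1, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 1, 0],
     [0, 0, 1, 1, 0, 1, 1]]

# Bit-flip on qubit 1 (Pauli word IXIIIII), no phase-flips
E = [0, 1, 0, 0, 0, 0, 0] + [0] * 7
print(syndrome(H, H, E))  # [1, 1, 0, 0, 0, 0]
```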
The decoder prediction as to what error happened is D = D_X | D_Z. A successful error decoding does not require that D = E,
but that D + E is not a logical operator. There are a couple ways to check this.
For bitflip errors, we check that the residual error R = D_X + E_X is not L_X. Since X anticommutes
with Z, we can check that L_Z(D_X + E_X) = 0. This is because we just need to check if they have mutual support on an even
or odd number of qubits. We could also check that R is not a stabilizer.
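The parity check on the residual error can be written in a few lines. This sketch uses the Steane code's logical Z observable IIIIZZZ; is_logical_error is a hypothetical helper, not a library function.

```python
def is_logical_error(L_Z, D_X, E_X):
    """True if R = D_X + E_X (mod 2) overlaps L_Z on an odd number of qubits."""
    R = [(d + e) % 2 for d, e in zip(D_X, E_X)]
    return sum(l * r for l, r in zip(L_Z, R)) % 2 == 1


L_Z = [0, 0, 0, 0, 1, 1, 1]      # support of the observable IIIIZZZ
E_X = [0, 0, 0, 0, 0, 1, 0]      # actual bit-flip on qubit 5

good_guess = [0, 0, 0, 0, 0, 1, 0]   # decoder predicts the same error
bad_guess = [0, 0, 0, 0, 1, 0, 1]    # same syndrome, but differs by a logical X

print(is_logical_error(L_Z, good_guess, E_X))  # False
print(is_logical_error(L_Z, bad_guess, E_X))   # True
```

The second prediction has the same syndrome as the true error, yet the residual is X on qubits 4, 5 and 6, which overlaps the logical Z observable on three qubits and therefore flips it.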
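Similar to the parity check matrix, the logical observables are also stored in a matrix as
\[ L = \begin{pmatrix} L_Z & 0 \\ 0 & L_X \end{pmatrix} \]
so that when determining logical errors, we can do matrix multiplication
\[ P = L \cdot R, \qquad P_X = L_Z \cdot R_X, \qquad P_Z = L_X \cdot R_Z, \]
where R = D + E is the residual error. Here we’re using P as this can be stored in a Pauli frame tracker to track observable flips.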
Each logical qubit has logical observables associated with it. Depending on what basis the data qubits are measured in, either the
X or Z logical observables can be measured. The data qubits which support the logical observables are contained in the qec::code class as well.
To do a logical Z(X) measurement, measure out all of the data qubits in the Z(X) basis. Then check support on the appropriate
Z(X) observable.
Memory Circuit Experiments
Memory circuit experiments test a QEC code’s ability to preserve quantum information over time by:
Preparing an initial logical state
Performing multiple rounds of stabilizer measurements
Measuring data qubits to verify state preservation
Optionally applying noise during the process
Function Variants
import cudaq
import cudaq_qec as qec
# Use the stim backend for performance in QEC settings
cudaq.set_target("stim")
# Get a code instance
code = qec.get_code("steane")
# Basic memory circuit with |0⟩ state
syndromes, measurements = qec.sample_memory_circuit(
code, # QEC code instance
numShots=1000, # Number of circuit executions
numRounds=1 # Number of stabilizer rounds
)
# Memory circuit with custom initial state
syndromes, measurements = qec.sample_memory_circuit(
code, # QEC code instance
op=qec.operation.prep1, # Initial state
numShots=1000, # Number of shots
numRounds=1 # Number of rounds
)
# Memory circuit with noise model
noise = cudaq.NoiseModel()
# Configure noise
noise.add_all_qubit_channel("x", cudaq.Depolarization2(0.01), 1)
syndromes, measurements = qec.sample_memory_circuit(
code, # QEC code instance
numShots=1000, # Number of shots
numRounds=1, # Number of rounds
noise=noise # Noise model
)
// Basic memory circuit with |0⟩ state
auto [syndromes, measurements] = qec::sample_memory_circuit(
code, // QEC code instance
numShots, // Number of circuit executions
numRounds // Number of stabilizer rounds
);
// Memory circuit with custom initial state
auto [syndromes, measurements] = qec::sample_memory_circuit(
code, // QEC code instance
operation::prep1, // Initial state preparation
numShots, // Number of circuit executions
numRounds // Number of stabilizer rounds
);
// Memory circuit with noise model
auto noise_model = cudaq::noise_model();
noise_model.add_channel(...); // Configure noise
auto [syndromes, measurements] = qec::sample_memory_circuit(
code, // QEC code instance
numShots, // Number of circuit executions
numRounds, // Number of stabilizer rounds
noise_model // Noise model to apply
);
Return Values
The functions return a tuple containing:
Syndrome Measurements (tensor<uint8_t>):
Shape: (num_shots, num_rounds * syndrome_size)
Contains stabilizer measurement results
Values are 0 or 1 representing measurement outcomes
Data Measurements (tensor<uint8_t>):
Shape: (num_shots, block_size)
Contains final data qubit measurements
Used to verify logical state preservation
Example Usage
Example of running a memory experiment:
import cudaq
import cudaq_qec as qec
# Use the stim backend for performance in QEC settings
cudaq.set_target("stim")
# Create code and decoder
code = qec.get_code('steane')
decoder = qec.get_decoder('single_error_lut',
code.get_parity())
# Configure noise
noise = cudaq.NoiseModel()
noise.add_all_qubit_channel("x", cudaq.Depolarization2(0.01), 1)
# Run memory experiment
syndromes, measurements = qec.sample_memory_circuit(
code,
op=qec.operation.prep0,
numShots=1000,
numRounds=10,
noise=noise
)
# Analyze results
for shot in range(1000):
    # Get syndrome for this shot
    syndrome = syndromes[shot].tolist()
    # Decode syndrome
    result = decoder.decode(syndrome)
    if result.converged:
        # Process correction
        pass
// Compile and run with:
// nvq++ --enable-mlir --target=stim -lcudaq-qec example.cpp
// ./a.out
#include "cudaq.h"
#include "cudaq/qec/decoder.h"
#include "cudaq/qec/experiments.h"
#include "cudaq/qec/noise_model.h"
int main(){
// Create a Steane code instance
auto code = cudaq::qec::get_code("steane");
// Configure noise model
cudaq::noise_model noise;
noise.add_all_qubit_channel("x", cudaq::depolarization2(0.1),
/*num_controls=*/1);
// Run memory experiment
auto [syndromes, data] = cudaq::qec::sample_memory_circuit(
*code, // Code instance
cudaq::qec::operation::prep0, // Prepare |0⟩ state
1000, // 1000 shots
1, // 1 round
noise // Apply noise
);
// Analyze results
auto decoder = cudaq::qec::get_decoder("single_error_lut", code->get_parity());
for (std::size_t shot = 0; shot < 1000; shot++) {
// Get syndrome for this shot
std::vector<cudaq::qec::float_t> syndrome(syndromes.shape()[1]);
for (std::size_t i = 0; i < syndrome.size(); i++)
syndrome[i] = syndromes.at({shot, i});
// Decode syndrome
auto results = decoder->decode(syndrome);
// Process correction
// ...
}
}
Additional Noise Models
import cudaq

noise = cudaq.NoiseModel()
# Add multiple error channels
noise.add_all_qubit_channel('h', cudaq.BitFlipChannel(0.001))
# Specify two qubit errors
noise.add_all_qubit_channel("x", cudaq.Depolarization2(0.01), 1)
cudaq::noise_model noise;
// Add multiple error channels
noise.add_all_qubit_channel(
"x", cudaq::bit_flip_channel(/*probability*/ 0.01));
// Specify two qubit errors
noise.add_all_qubit_channel(
"x", cudaq::depolarization2(/*probability*/ 0.01),
/*numControls*/ 1);