CUDA-Q QEC Python API
Code
Detector Error Model
Decoder Interfaces
Built-in Decoders
NVIDIA QLDPC Decoder
- class nv_qldpc_decoder
A general-purpose quantum low-density parity-check (QLDPC) decoder based on GPU-accelerated belief propagation (BP). Since belief propagation is an iterative method that may fail to converge on a solution, decoding can be improved with a second-stage post-processing step. Optionally, ordered statistics decoding (OSD) can be chosen to perform the second stage of decoding.
An [[n,k,d]] quantum error correction (QEC) code encodes k logical qubits into an n-qubit data block, with a code distance d. Quantum low-density parity-check (QLDPC) codes are characterized by sparse parity-check matrices (or Tanner graphs), corresponding to a bounded number of parity checks per data qubit.
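As a small illustration of what a parity check matrix encodes, the sketch below computes a syndrome and the row and column weights for the sample 3x7 matrix used in the decoder-creation example in this section. It is plain Python written for illustration only; the helper names (`syndrome`, `row_weights`, `column_weights`) are hypothetical and are not part of the CUDA-Q QEC API.

```python
# Sample 3x7 parity check matrix: rows are checks, columns are data bits.
H = [
    [1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 1],
]


def syndrome(pcm, error):
    """Return the syndrome s = H e (mod 2) as a list of bits."""
    return [sum(h * e for h, e in zip(row, error)) % 2 for row in pcm]


def row_weights(pcm):
    """Number of data bits that take part in each check."""
    return [sum(row) for row in pcm]


def column_weights(pcm):
    """Number of checks that each data bit takes part in."""
    return [sum(col) for col in zip(*pcm)]


# A single error on data bit 3 (0-indexed) flips exactly the checks in column 3.
single_error = [0, 0, 0, 1, 0, 0, 0]
s = syndrome(H, single_error)

print("syndrome:", s)
print("row weights:", row_weights(H))
print("column weights:", column_weights(H))
```

For a QLDPC code, the row and column weights stay bounded as the block size grows, which is what keeps the Tanner graph sparse.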
Requires a CUDA-Q compatible GPU. See the CUDA-Q GPU Compatibility List for a list of valid GPU configurations.
References: Decoding Across the Quantum LDPC Code Landscape
Note
It is required to create decoders with the get_decoder API from the CUDA-QX extension points API, such as

Python:

    import cudaq_qec as qec
    import numpy as np

    H = np.array([[1, 0, 0, 1, 0, 1, 1],
                  [0, 1, 0, 1, 1, 0, 1],
                  [0, 0, 1, 0, 1, 1, 1]], dtype=np.uint8)  # sample 3x7 PCM
    opts = dict()  # see below for options

    # Note: H must be in row-major order. If you use
    # `scipy.sparse.csr_matrix.todense()` to get the parity check
    # matrix, you must specify todense(order='C') to get a row-major
    # matrix.
    nvdec = qec.get_decoder('nv-qldpc-decoder', H, **opts)

C++:

    std::size_t block_size = 7;
    std::size_t syndrome_size = 3;
    cudaqx::tensor<uint8_t> H;

    std::vector<uint8_t> H_vec = {1, 0, 0, 1, 0, 1, 1,
                                  0, 1, 0, 1, 1, 0, 1,
                                  0, 0, 1, 0, 1, 1, 1};
    H.copy(H_vec.data(), {syndrome_size, block_size});

    cudaqx::heterogeneous_map nv_custom_args;
    nv_custom_args.insert("use_osd", true);
    // See below for options

    auto nvdec = cudaq::qec::get_decoder("nv-qldpc-decoder", H, nv_custom_args);
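The row-major requirement can be made concrete with a short plain-Python sketch. It flattens the sample matrix row by row and checks that the result matches the H_vec buffer from the C++ example, with entry (i, j) at offset i * block_size + j. The helper name `at` is hypothetical and exists only for this illustration.

```python
# Sample 3x7 parity check matrix from the example above.
H = [
    [1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 1],
]
syndrome_size = len(H)   # number of rows (checks)
block_size = len(H[0])   # number of columns (data bits)

# Row-major (C order): all entries of row 0, then row 1, and so on.
row_major = [bit for row in H for bit in row]

# Column-major (Fortran order): all entries of column 0, then column 1, ...
column_major = [H[i][j] for j in range(block_size) for i in range(syndrome_size)]


def at(flat, i, j, n_cols):
    """Read entry (i, j) from a row-major flat buffer."""
    return flat[i * n_cols + j]


# The flat buffer used in the C++ example.
H_vec = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

print("row-major matches H_vec:", row_major == H_vec)
print("column-major matches H_vec:", column_major == H_vec)
```

The two flattenings differ for this matrix, so a decoder that reads a column-major buffer as row-major would see a different parity check matrix.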
Note
The "nv-qldpc-decoder" implements the cudaq_qec.Decoder interface for Python and the cudaq::qec::decoder interface for C++, so it supports all the methods in those respective classes.
- Parameters:
H – Parity check matrix (tensor format)
params –
Heterogeneous map of parameters:
use_sparsity (bool): Whether or not to use a sparse matrix solver
error_rate (double): Probability of an error (in 0-1 range) on a block data bit (defaults to 0.001)
error_rate_vec (double): Vector of length "block size" containing the probability of an error (in 0-1 range) on a block data bit (defaults to 0.001). This overrides error_rate.
max_iterations (int): Maximum number of BP iterations to perform (defaults to 30)
n_threads (int): Number of CUDA threads to use for the GPU decoder (defaults to smart selection based on parity matrix size)
use_osd (bool): Whether or not to use an OSD post processor if the initial BP algorithm fails to converge on a solution
osd_method (int): 1=OSD-0, 2=Exhaustive, 3=Combination Sweep (defaults to 1). Ignored unless use_osd is true.
osd_order (int): OSD postprocessor order (defaults to 0). Ref: Decoding Across the Quantum LDPC Code Landscape
    For osd_method=2 (Exhaustive), the number of possible permutations searched after OSD-0 grows by 2^osd_order.
    For osd_method=3 (Combination Sweep), this is the λ parameter. All weight 1 permutations and the first λ bits worth of weight 2 permutations are searched after OSD-0. This is (syndrome_length - block_size + λ * (λ - 1) / 2) additional permutations.
    For other osd_method values, this is ignored.
bp_batch_size (int): Number of syndromes that will be decoded in parallel for the BP decoder (defaults to 1)
osd_batch_size (int): Number of syndromes that will be decoded in parallel for OSD (defaults to the number of concurrent threads supported by the hardware)
iter_per_check (int): Number of iterations between BP convergence checks (defaults to 1, and max is max_iterations). Introduced in 0.4.0.
clip_value (float): Value to clip the BP messages to. Should be a non-negative value (defaults to 0.0, which disables clipping). Introduced in 0.4.0.
bp_method (int): Core BP algorithm to use (defaults to 0). Introduced in 0.4.0, expanded in 0.5.0:
    0: sum-product
    1: min-sum (introduced in 0.4.0)
    2: min-sum+mem (uniform memory strength, introduced in 0.5.0)
    3: min-sum+dmem (disordered memory strength, introduced in 0.5.0)
composition (int): Iteration strategy (defaults to 0). Introduced in 0.5.0:
    0: Standard (single run)
    1: Sequential relay (multiple gamma legs). Requires: bp_method=3, srelay_config
scale_factor (float): The scale factor to use for min-sum. Defaults to 1.0. When set to 0.0, the scale factor is dynamically computed based on the number of iterations. Introduced in 0.4.0.
gamma0 (float): Memory strength parameter. Required for bp_method=2, and for composition=1 (sequential relay). Introduced in 0.5.0.
gamma_dist (vector<float>): Gamma distribution interval [min, max] for disordered memory strength. Required for bp_method=3 if explicit_gammas not provided. Introduced in 0.5.0.
explicit_gammas (vector<vector<float>>): Explicit gamma values for each variable node. For bp_method=3 with composition=0, provide a 2D vector where each row has block_size columns. For composition=1 (Sequential relay), provide num_sets rows (one per relay leg). Overrides gamma_dist if provided. Introduced in 0.5.0.
srelay_config (heterogeneous_map): Sequential relay configuration (required for composition=1). Contains the following parameters. Introduced in 0.5.0:
    pre_iter (int): Number of pre-iterations to run before relay legs
    num_sets (int): Number of relay sets (legs) to run
    stopping_criterion (string): When to stop relay legs:
        "All": Run all legs
        "FirstConv": Stop relay after first convergence
        "NConv": Stop after N convergences (requires stop_nconv parameter)
    stop_nconv (int): Number of convergences to wait for before stopping (required only when stopping_criterion="NConv")
bp_seed (int): Seed for random number generation used in bp_method=3 (disordered memory BP). Optional parameter, defaults to 42 if not provided. Introduced in 0.5.0.
opt_results (heterogeneous_map): Optional results to return. This field can be left empty if no additional results are desired. Choices are:
    bp_llr_history (int): Return the last bp_llr_history iterations of the BP LLR history. Minimum value is 0 and maximum value is max_iterations. The actual number of returned iterations might be fewer than bp_llr_history if BP converges before the requested number of iterations. Introduced in 0.4.0. Note: Not supported for composition=1.
    num_iter (bool): If true, return the number of BP iterations run. Introduced in 0.5.0.
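To show how the options above fit together, the following sketch builds two option dictionaries: one for min-sum BP with an OSD second stage, and one for sequential relay. The option names come from the parameter list; all numeric values are illustrative, not recommendations. Passing srelay_config and opt_results as nested Python dicts is an assumption about how the heterogeneous map is expressed in Python, and `check_options` is a hypothetical helper that only mirrors the "Required for" rules stated above. The decoder itself is not invoked, since that needs a supported GPU.

```python
# Min-sum BP followed by OSD post-processing (illustrative values).
bp_osd_opts = {
    "error_rate": 0.001,
    "max_iterations": 50,
    "bp_method": 1,            # 1 = min-sum
    "scale_factor": 1.0,
    "use_osd": True,
    "osd_method": 3,           # 3 = Combination Sweep
    "osd_order": 10,           # the lambda parameter for Combination Sweep
    "opt_results": {"num_iter": True},
}

# Sequential relay: needs bp_method=3, gamma0, and srelay_config.
relay_opts = {
    "bp_method": 3,            # 3 = min-sum+dmem
    "composition": 1,          # 1 = sequential relay
    "gamma0": 0.1,             # illustrative memory strength
    "gamma_dist": [0.1, 0.5],  # illustrative [min, max] interval
    "bp_seed": 42,
    "srelay_config": {         # assumption: nested dict in Python
        "pre_iter": 5,
        "num_sets": 3,
        "stopping_criterion": "NConv",
        "stop_nconv": 2,
    },
}


def check_options(opts):
    """Hypothetical helper: list violations of the documented requirements."""
    problems = []
    bp_method = opts.get("bp_method", 0)
    composition = opts.get("composition", 0)
    if bp_method == 2 and "gamma0" not in opts:
        problems.append("bp_method=2 requires gamma0")
    if bp_method == 3 and not ("gamma_dist" in opts or "explicit_gammas" in opts):
        problems.append("bp_method=3 requires gamma_dist or explicit_gammas")
    if composition == 1:
        if bp_method != 3:
            problems.append("composition=1 requires bp_method=3")
        if "gamma0" not in opts:
            problems.append("composition=1 requires gamma0")
        relay = opts.get("srelay_config")
        if relay is None:
            problems.append("composition=1 requires srelay_config")
        elif relay.get("stopping_criterion") == "NConv" and "stop_nconv" not in relay:
            problems.append('stopping_criterion="NConv" requires stop_nconv')
        if "bp_llr_history" in opts.get("opt_results", {}):
            problems.append("bp_llr_history is not supported for composition=1")
    return problems


print(check_options(bp_osd_opts))
print(check_options(relay_opts))

# With cudaq_qec installed and a supported GPU, the options would be passed as:
#   nvdec = qec.get_decoder('nv-qldpc-decoder', H, **bp_osd_opts)
```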
Tensor Network Decoder
- class cudaq_qec.plugin.decoders.tensor_network_decoder.TensorNetworkDecoder
A general class for tensor network decoders for quantum error correction codes.
This decoder constructs a tensor network representation of a quantum code using its parity check matrix, logical observables, and noise model. The tensor network is based on the Tanner graph of the code and can be contracted to compute the probability that a logical observable has flipped, given a syndrome.
The decoder supports both single-syndrome and batch decoding, and can run on CPU or GPU (using cuTensorNet if available).
The Tensor Network Decoder is a Python-only implementation and it requires Python 3.11 or higher. C++ APIs are not available for this decoder.
Due to its additional dependencies, the Tensor Network Decoder is only available when the optional pip package is specified when installing CUDA-Q QEC. Use pip install cudaq-qec[tensor-network-decoder] to install it.
The Tensor Network Decoder has the same GPU support as the Quantum Low-Density Parity-Check Decoder. However, if you are using the V100 GPU (SM70), you will need to pin your cuTensor version to 2.2 by running pip install cutensor_cu12==2.2. Note that this GPU will not be supported by the Tensor Network Decoder when CUDA-Q 0.5.0 is released.
Note
It is recommended to create decoders using the cudaq_qec plugin API:

    import cudaq_qec as qec
    import numpy as np

    # Example: [3,1] repetition code
    H = np.array([[1, 1, 0],
                  [0, 1, 1]], dtype=np.uint8)
    logical_obs = np.array([[1, 1, 1]], dtype=np.uint8)
    noise_model = [0.1, 0.1, 0.1]

    decoder = qec.get_decoder("tensor_network_decoder", H,
                              logical_obs=logical_obs,
                              noise_model=noise_model)

    syndrome = [0.0, 1.0]
    result = decoder.decode(syndrome)
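For a code as small as the repetition code in the example above, the probability that the logical observable flipped given a syndrome can also be computed by brute-force enumeration. The sketch below does this in plain Python, assuming independent bit errors with the listed probabilities and a hard (0/1) syndrome. It is a reference calculation for illustration, not the decoder's implementation, and `logical_flip_probability` is a hypothetical helper name.

```python
from itertools import product

# Same inputs as the repetition-code example above.
H = [[1, 1, 0],
     [0, 1, 1]]
logical_obs = [1, 1, 1]
noise_model = [0.1, 0.1, 0.1]


def dot_mod2(a, b):
    """Inner product of two bit vectors, modulo 2."""
    return sum(x * y for x, y in zip(a, b)) % 2


def logical_flip_probability(pcm, observable, probs, syndrome):
    """P(observable flipped | syndrome), by enumerating all error patterns."""
    weight = [0.0, 0.0]  # total probability for observable value 0 and 1
    for error in product((0, 1), repeat=len(probs)):
        # Keep only errors consistent with the observed syndrome.
        if [dot_mod2(row, error) for row in pcm] != list(syndrome):
            continue
        p = 1.0
        for bit, q in zip(error, probs):
            p *= q if bit else 1.0 - q
        weight[dot_mod2(observable, error)] += p
    total = weight[0] + weight[1]
    if total == 0.0:
        raise ValueError("no error pattern is consistent with this syndrome")
    return weight[1] / total


p_flip = logical_flip_probability(H, logical_obs, noise_model, [0, 1])
print(p_flip)
```

For syndrome [0, 1], only the errors (0, 0, 1) and (1, 1, 0) are consistent, with probabilities 0.081 and 0.009, so the result is approximately 0.9. The enumeration costs 2^num_qubits steps, which is why a tensor network contraction is used for larger codes.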
Tensor Network Structure
The tensor network constructed by this decoder is based on the Tanner graph of the code, extended with noise and logical observable tensors. The structure is illustrated below:
Layers of the network, from top to bottom:

    open/output index             < logical observable
    s1  s2  s3                    < syndromes: product of 2D vectors [1, 1-2pi]
                                    (pi is the probability detector i flipped)
    c1  c2  l1  c3                < checks / logical: delta tensors     |
    H   H   H   H   H   H         < Hadamard matrices                   | TANNER (bipartite) GRAPH
    e1  e2  e3                    < errors: delta tensors               |
    P(e1, e2, e3)                 < noise / error model: classical probability density

Each syndrome si attaches to its check ci, and the open/output index attaches to the logical tensor l1. Checks and logicals connect to errors through Hadamard matrices, and the errors attach to the noise model.
ci, ej, lk are delta tensors represented sparsely as indices.
- Parameters:
H – Parity check matrix (numpy.ndarray), shape (num_checks, num_qubits)
logical_obs – Logical observable matrix (numpy.ndarray), shape (1, num_qubits)
noise_model – Noise model, either a list of probabilities (length = num_qubits) or a quimb.tensor.TensorNetwork
check_inds – (optional) List of check index names
error_inds – (optional) List of error index names
logical_inds – (optional) List of logical index names
logical_tags – (optional) List of logical tags
contract_noise_model – (bool, optional) Whether to contract the noise model at initialization (default: True)
dtype – (str, optional) Data type for tensors (default: “float32”)
device – (str, optional) Device for tensor operations (“cpu”, “cuda”, or “cuda:X”, default: “cuda”)
Methods
- decode(syndrome)
Decode a single syndrome by contracting the tensor network.
- Parameters:
syndrome – List of float values (soft-decision probabilities) for each check.
- Returns:
DecoderResult with the probability that the logical observable flipped.
- decode_batch(syndrome_batch)
Decode a batch of syndromes.
- Parameters:
syndrome_batch – numpy.ndarray of shape (batch_size, num_checks)
- Returns:
List of DecoderResult objects with the probability that the logical observable has flipped for each syndrome.
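The expected layout of syndrome_batch, one row per syndrome and one column per check, can be sketched by sampling errors for the repetition code and computing their syndromes. This is plain Python for illustration: `sample_syndrome_batch` is a hypothetical helper, independent bit errors are assumed, and the final conversion to a numpy.ndarray and the decode_batch call are left as a comment because they need cudaq_qec to be installed.

```python
import random

# [3,1] repetition code from the example above.
H = [[1, 1, 0],
     [0, 1, 1]]
noise_model = [0.1, 0.1, 0.1]


def sample_syndrome_batch(pcm, probs, batch_size, seed=0):
    """Sample independent bit errors and return their syndromes, one per row."""
    rng = random.Random(seed)
    batch = []
    for _ in range(batch_size):
        error = [1 if rng.random() < p else 0 for p in probs]
        # Each row holds one float per check, matching decode(syndrome).
        row = [float(sum(h * e for h, e in zip(check, error)) % 2) for check in pcm]
        batch.append(row)
    return batch


syndrome_batch = sample_syndrome_batch(H, noise_model, batch_size=8)
print(len(syndrome_batch), "syndromes x", len(syndrome_batch[0]), "checks")

# With cudaq_qec and NumPy available, the batch would be decoded as:
#   results = decoder.decode_batch(np.asarray(syndrome_batch))
```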
- optimize_path(optimize=None, batch_size=-1)
Optimize the contraction path for the tensor network.
- Parameters:
optimize – Optimization options or None
batch_size – (int, optional) Batch size for optimization (default: -1, no batching)
- Returns:
Optimizer info object