CUDA-Q Accelerates Quantum Workloads at GTC 2026
Over the past year the quantum computing industry has shifted from qubit demonstrations to the work of building large-scale quantum-classical supercomputers. Central to that transition has been the intersection of quantum computing work with AI supercomputing. At GTC 2026, researchers and developers across the ecosystem are demonstrating how the NVIDIA CUDA-Q platform is bringing accelerated computing to the key workloads set to define quantum computing in 2026 and beyond.

Quantum Application Development
Building better quantum algorithms remains a critical workload for shortening the timeline to utility-scale quantum computing. GPU-accelerated simulation is the central tool for exploring and refining these algorithms, and teams around the world continue to push application work to new levels with CUDA-Q's access to GPU-accelerated simulation.
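At its core, the statevector simulation that these GPU backends accelerate is repeated linear algebra on a vector of \(2^n\) complex amplitudes. The following is a minimal NumPy sketch, purely illustrative and far from a production simulator, that prepares a 3-qubit GHZ state the way a dense simulator would:

```python
import numpy as np

# Minimal dense statevector simulator: the same math that GPU-accelerated
# backends apply at far larger qubit counts.

def apply_gate(state, gate, target, n):
    """Apply a single-qubit gate to qubit `target` of an n-qubit statevector."""
    state = state.reshape([2] * n)
    state = np.moveaxis(state, target, 0)
    state = np.tensordot(gate, state, axes=([1], [0]))
    state = np.moveaxis(state, 0, target)
    return state.reshape(-1)

def apply_cnot(state, control, target, n):
    """Apply CNOT by swapping the target amplitudes where control = 1."""
    state = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    sub = state[tuple(idx)]
    # Axis of `target` inside the sub-array (control axis has been removed).
    state[tuple(idx)] = np.flip(sub, axis=target - (1 if target > control else 0))
    return state.reshape(-1)

n = 3
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
state = np.zeros(2**n)
state[0] = 1.0                          # start in |000>
state = apply_gate(state, H, 0, n)      # Hadamard on qubit 0
state = apply_cnot(state, 0, 1, n)      # entangle qubit 1
state = apply_cnot(state, 1, 2, n)      # entangle qubit 2
# GHZ state: only |000> and |111> carry amplitude 1/sqrt(2)
print(np.round(state, 3))
```

Each added qubit doubles the amplitude vector, which is why the large runs described below need many GPUs' worth of memory.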
CINECA and Kipu Quantum have run what is believed to be the largest known statevector simulation of a quantum optimization routine: a 43-qubit simulation of a novel Quantum-Enhanced Memetic Tabu Search (QE-MTS) algorithm on 2,048 NVIDIA Ampere GPUs. QE-MTS targets Low Autocorrelation Binary Sequence (LABS) problems, a challenging class of combinatorial optimization with applications in signal processing and communications.
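The LABS objective that QE-MTS targets minimizes the sidelobe energy of a ±1 sequence. A small pure-Python sketch (the objective and a brute-force baseline only, not the QE-MTS algorithm) shows the problem, along with the memory arithmetic behind a 43-qubit statevector:

```python
from itertools import product

def sidelobe_energy(s):
    """LABS objective: E(s) = sum_k C_k^2, with C_k = sum_i s_i * s_{i+k}."""
    n = len(s)
    return sum(sum(s[i] * s[i + k] for i in range(n - k)) ** 2
               for k in range(1, n))

# Brute force a tiny instance; the search space grows as 2^N, which is
# why heuristics like tabu search are used at realistic sizes.
n = 7
best = min(product((-1, 1), repeat=n), key=sidelobe_energy)
print(best, sidelobe_energy(best))   # optimal energy for N=7 is E=3

# Memory behind a 43-qubit statevector: one complex128 amplitude per
# basis state, i.e. 2^43 * 16 bytes, spread across 2,048 GPUs.
total_bytes = (2**43) * 16
print(total_bytes / 2**40, "TiB total,",
      total_bytes / 2048 / 2**30, "GiB per GPU")
```

At 128 TiB total, the amplitude vector alone works out to 64 GiB per GPU across 2,048 devices, which is why a run at this scale requires a full supercomputer partition.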
Infleqtion is applying CUDA-Q to biomarker discovery through its Q4Bio project, using quantum neural networks to identify high-impact feature sets from complex cancer data. The team has consumed 24,000 NVIDIA A100 GPU-node-hours on the Perlmutter supercomputer at NERSC to train these networks. Infleqtion uses a hybrid combination of its Sqale QPU and NVIDIA GPUs for inference, and is exploring NVIDIA NVQLink to scale execution further.

Fig 1: Hypergraph showing interactions between logical qubits in Infleqtion's demonstration of quantum neural networks
A collaboration between UCL, QMatter, the Technical University of Munich, Ludwig-Maximilians-Universität München, LRZ, and IQM demonstrated a hybrid biomolecular simulation of a G-protein-coupled receptor (GPCR) on the CUDA-Q platform, with the quantum component executed on Euro-Q-Exa, a 54-qubit IQM system, alongside NVIDIA H100 GPUs hosted on the LRZ BayernKI AI cluster. The postprocessing stage of the work was further scaled to a billion quantum configurations using 1,200 NVIDIA H100 GPUs on the Eos cluster.

Fig 2: GPCR protein active site modelled in collaboration between UCL, QMatter, TUM, LMU, LRZ and IQM
These results join a broader wave of GPU-accelerated quantum application work showcased at GTC 2026:
- Aegiq used DGX Spark to prototype hundred-qubit tensor-network-based CFD simulation workloads that could then be scaled to thousands of qubits.
- Classiq built an end-to-end path from high-level quantum modeling to CUDA-Q execution, reducing synthesis and execution of a 31-qubit circuit from 67 minutes to 2.5 minutes on a single A100 GPU.
- TII built a cuTENSOR-based emulator for adiabatic quantum annealing, demonstrating simulations of up to 500,000 qubits with \(7.5 \times 10^9\) two-qubit entangling gates.
- ORCA Computing built a GPU-accelerated photonic simulator on NVIDIA cuQuantum to model large photonic circuits.
- JIJ used CUDA-Q to implement Photonic Quantum-Enhanced Knowledge Distillation, a hybrid quantum-classical framework for efficiently compressing convolutional neural networks, validated on MNIST, Fashion-MNIST, and CIFAR-10.
- OTI Lumionics ran its quantum-native iQCC algorithm for materials design, cutting calculation time from 109 hours on a 32-CPU cluster to 1.2 hours on a single NVIDIA Blackwell GPU.
- Kvantify benchmarked ground-state energy calculations for penicillin on DGX Spark using its Kvantify Qrunch platform.

Fig 3: Kvantify results showing performance of DGX Spark with CUDA-Q for quantum computing simulation workloads
Quantum Error Correction
CUDA-Q is also proving essential for research into quantum error correction — work critical to the long-term viability of quantum computing.
Researchers at the University of Edinburgh, working through the Grace Hopper Superchip Seed Program with Supermicro, achieved a breakthrough by developing a vibe decoder for color codes. By combining an ensemble of Belief Propagation (BP) decoders parallelized with CUDA on NVIDIA GH200 GPUs, vibe decoding achieves microsecond per-round batched throughput, 900x faster than the previous state of the art, providing the most compelling evidence yet that color code decoding could compete with industry-standard surface codes.
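BP decoders of the kind ensembled here pass messages between the checks and bits of a parity-check matrix until the estimated error reproduces the observed syndrome. A minimal min-sum BP sketch on a toy repetition code (illustrative only, not the Edinburgh ensemble decoder, and not color-code specific) looks like this:

```python
import numpy as np

def min_sum_bp(H, syndrome, p=0.1, iters=20):
    """Min-sum belief propagation for syndrome decoding on check matrix H."""
    m, n = H.shape
    prior = np.log((1 - p) / p)                 # per-bit error LLR prior
    M = np.where(H, prior, 0.0)                 # bit-to-check messages
    E = np.zeros((m, n))                        # check-to-bit messages
    for _ in range(iters):
        for j in range(m):                      # check-node update
            idx = np.flatnonzero(H[j])
            for i in idx:
                others = M[j, idx[idx != i]]
                sign = (-1) ** syndrome[j] * np.prod(np.sign(others))
                E[j, i] = sign * np.min(np.abs(others))
        post = prior + E.sum(axis=0)            # posterior LLRs
        err = (post < 0).astype(int)            # hard decision
        if np.array_equal(H @ err % 2, syndrome):
            return err                          # syndrome explained: done
        M = np.where(H, post - E, 0.0)          # bit-node update
    return err

# 5-bit repetition code: each check compares neighbouring bits.
H = np.array([[1, 1, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 0, 1, 1, 0],
              [0, 0, 0, 1, 1]])
error = np.array([0, 0, 1, 0, 0])
decoded = min_sum_bp(H, H @ error % 2)
print(decoded)   # recovers the single bit flip
```

The GPU-parallel ensemble approach amounts to running many such decoders concurrently with different settings and keeping the best valid correction, a pattern that maps naturally onto batched CUDA execution.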
Alice & Bob is using CUDA-Q's GPU acceleration to run noisy simulations validating elevator codes, a new class of error correction codes designed for the biased noise in its cat qubits. The team benchmarked a 9.25x speedup over CPU-based techniques when decoding a \([15,9,3]\), \(d_z=9\) elevator code from 100,000 simulated shots on an NVIDIA GH200.
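Decoder benchmarks of this kind are typically driven by Monte Carlo sampling of noisy shots. A toy sketch, assuming a simple i.i.d. bit-flip model and a 3-bit repetition code (not Alice & Bob's elevator code or biased noise model), shows the shape of such an experiment:

```python
import numpy as np

rng = np.random.default_rng(7)
p, shots = 0.05, 100_000               # physical error rate, sample count

# Toy stand-in for a decoder benchmark: 3-bit repetition code under
# i.i.d. bit flips, with majority vote playing the role of the decoder.
errors = rng.random((shots, 3)) < p
logical_failures = errors.sum(axis=1) >= 2   # majority vote fails on 2+ flips
p_logical = logical_failures.mean()
p_theory = 3 * p**2 * (1 - p) + p**3         # analytic failure probability
print(f"measured {p_logical:.4f} vs analytic {p_theory:.4f}")
```

At 100,000 shots the sampled failure rate converges tightly on the analytic value; GPU acceleration matters because realistic codes replace the majority vote with a full decoder that must run on every sampled shot.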
Quantum-Classical Integrations
Providing developers the tools they need to work with future quantum-GPU supercomputers is key to advancing quantum computing. Quantum software builders are drawing on CUDA-Q to ensure their tools seamlessly support hybrid systems.
Scaleway now supports CUDA-Q kernels from its Quantum-as-a-Service platform, with execution across hardware from Alpine Quantum Technologies, while PsiQuantum has integrated CUDA-Q with PsiQuantum Construct for simulating large-scale, fault-tolerant hybrid applications.
TII has integrated its Qibo framework with CUDA-Q for hybrid workload validation, with plans to use NVQLink in future work. Pasqal has integrated CUDA-Q with its Quantum Resource Management Interface (QRMI) runtime, enabling SLURM-based scheduling of work across Pasqal QPUs and accelerated computing.
Meanwhile, memQ announced its Extensible Distributed Quantum Compiler (xDQC), built on CUDA-Q, demonstrating how hybrid applications can be spread across multi-QPU systems by simulating hundreds of qubits across eight QPUs. HPE has incorporated NVQLink into its vision for quantum supercomputing.
AI-Powered Quantum Development
Quantum researchers are looking to the latest AI advances, and CUDA-Q is establishing itself as the backbone for AI-driven quantum development, making quantum algorithm and hardware design faster and more accessible.
Hiverge has integrated CUDA-Q with the Hive, an AI platform that uses LLM agents to translate natural-language problem descriptions into executable quantum circuits. Quantinuum is using the Hive for AI-guided discovery of quantum algorithms for combinatorial optimization, with applications in power grid dispatch, nuclear fuel arrangement, and molecular design for drug and materials discovery.
QCentroid has integrated CUDA-Q with its QuantumOps platform, giving enterprise users access through an AI copilot layer. And Conductor Quantum announced Coda's new Model Context Protocol (MCP), which gives AI agents access to quantum compute resources as tools — supporting CUDA-Q transpilation from other frameworks and circuit simulations of up to 34 qubits for autonomous debugging and optimization.
Get Started
The work at GTC 2026 shows hybrid quantum-classical supercomputing maturing rapidly across applications, error correction, integrations, and AI-driven development — with CUDA-Q as the common thread.