Welcome to The cuDNN Blog

Hey there, fellow GPU enthusiast! This is the unofficial-feeling, totally-official corner of the internet where we talk about cuDNN, NVIDIA’s GPU-accelerated library of deep learning primitives.

Whether you’re training a massive transformer, fine-tuning a convolutional network, or just trying to get GPUs to go brrr, cuDNN is the engine under the hood making it happen. This blog is where we share release notes, installation guides, and the occasional deep-dive into what makes cuDNN tick.

Check out the Installation Guides in the sidebar to get started, or read through the latest release notes below.

Latest Releases

May 02, 2026
cuDNN Frontend v1.23.0

Causal Conv1d, expanded Graph API (transpose, strided slice, in-place concat, reshape modes, compile-time scalars), new open-source GEMM/MoE kernels, and Python 3.14t wheels.
May 02, 2026
Watch our talk on GPU MODE: design choices in cuDNN attention

We joined the GPU MODE channel to walk through the design choices behind cuDNN's attention kernels. Watch on YouTube.
April 20, 2026
The 128×4 Tiled Layout for Block Scaling Factors

How cuDNN expects MXFP8 and NVFP4 block scaling factors to be laid out in the 128×4 tiled format on Blackwell GPUs, and how to convert to/from row-major.
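For a quick feel of what a row-major to 128×4 tiled conversion can look like, here is a minimal NumPy sketch. It assumes the tile ordering that is publicly documented for block scaling factors on Blackwell: tiles are stored one after another in row-major order, and inside a tile the offset is (row % 32) * 16 + (row // 32) * 4 + col. The helper names `to_tiled` and `from_tiled` and the zero padding are illustrative choices, not cuDNN API; the full post is the reference for what cuDNN actually expects.

```python
import numpy as np

TILE_ROWS, TILE_COLS = 128, 4  # tile shape named in the post title


def to_tiled(scales: np.ndarray) -> np.ndarray:
    """Row-major (rows, cols) scale factors -> flat tiled buffer (assumed ordering)."""
    rows, cols = scales.shape
    row_tiles = -(-rows // TILE_ROWS)  # ceiling division
    col_tiles = -(-cols // TILE_COLS)
    # Pad up to whole tiles; zero padding is an assumption of this sketch.
    padded = np.zeros((row_tiles * TILE_ROWS, col_tiles * TILE_COLS), dtype=scales.dtype)
    padded[:rows, :cols] = scales
    # Split the row index r into r_hi * 32 + r_lo.
    # Axes: (row_tile, r_hi, r_lo, col_tile, c)
    x = padded.reshape(row_tiles, 4, 32, col_tiles, TILE_COLS)
    # Reorder to (row_tile, col_tile, r_lo, r_hi, c) so that the
    # in-tile offset becomes r_lo * 16 + r_hi * 4 + c.
    return np.ascontiguousarray(x.transpose(0, 3, 2, 1, 4)).reshape(-1)


def from_tiled(flat: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Inverse of to_tiled: flat tiled buffer -> row-major (rows, cols)."""
    row_tiles = -(-rows // TILE_ROWS)
    col_tiles = -(-cols // TILE_COLS)
    # Axes: (row_tile, col_tile, r_lo, r_hi, c)
    x = flat.reshape(row_tiles, col_tiles, 32, 4, TILE_COLS)
    # Back to (row_tile, r_hi, r_lo, col_tile, c), then merge into 2D.
    padded = x.transpose(0, 3, 2, 1, 4).reshape(
        row_tiles * TILE_ROWS, col_tiles * TILE_COLS
    )
    return padded[:rows, :cols].copy()
```

Calling `from_tiled(to_tiled(s), *s.shape)` should give back `s` for any 2D array, which makes for an easy sanity check before handing a buffer to a real kernel.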
April 20, 2026
How Scales Are Applied in MXFP8 Attention

A deep dive into how cuDNN applies block-wise and fixed scaling in microscaling FP8 (MXFP8) attention for Blackwell GPUs.
April 20, 2026
cuDNN Backend Now Has Preview Releases

Try upcoming cuDNN backend features early with pip install --pre. Stable releases remain unchanged.
April 10, 2025
cuDNN Frontend v1.22.1

PyTorch custom op for MoE Grouped GEMM, Blackwell SDPA forward kernel (head dim 256), and weight-gradient kernels.
April 03, 2025
cuDNN Frontend v1.22.0

PyTorch custom op for SDPA, preindexed execute, Blackwell bprop kernels, and Grouped GEMM improvements.
March 25, 2025
cuDNN Frontend v1.21.0

No more CUDA driver dependency, plus a wave of new Grouped GEMM fusion kernels for MoE workloads.
March 16, 2025
cuDNN Frontend v1.20.0

Fused RMSNorm + SiLU kernel for B200, GB300 support for GEMM kernels, and reproducer tool improvements.
March 11, 2025
cuDNN Frontend v1.19.1

Hotfix restoring CUDA 12 support, plus the full v1.19.0 feature set including open-source SDPA kernels.