RAPIDS Accelerator For Apache Spark provides a set of plugins for Apache Spark that leverage GPUs to accelerate DataFrame and SQL processing.

The accelerator is built upon the RAPIDS cuDF project and UCX.

The RAPIDS Accelerator For Apache Spark requires each worker node in the cluster to have CUDA installed.

The RAPIDS Accelerator For Apache Spark consists of two jars, a plugin jar and the RAPIDS cuDF jar, which are either preinstalled in the Spark classpath on all nodes or submitted with each job that uses the accelerator. See the getting-started guide for more details.
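
As a rough illustration of the "submitted with each job" option, the sketch below assembles a spark-submit command line that ships a jar with the job. The jar file name is borrowed from the signature-verification step on this page; the plugin class `com.nvidia.spark.SQLPlugin` and the application name `my_job.py` are assumptions not stated here, so confirm them against the getting-started guide before use.

```python
import shlex

# Assumption: plugin class name as given in the getting-started guide, not on this page.
PLUGIN_CLASS = "com.nvidia.spark.SQLPlugin"


def build_spark_submit(jar_path, app, extra_conf=None):
    """Return a spark-submit argument list that ships the jar with the job."""
    conf = {"spark.plugins": PLUGIN_CLASS}
    conf.update(extra_conf or {})
    cmd = ["spark-submit", "--jars", jar_path]
    # Sort the settings so the generated command is deterministic.
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # "my_job.py" is a hypothetical application script.
    command = build_spark_submit("rapids-4-spark_2.12-25.08.0.jar", "my_job.py")
    print(shlex.join(command))
```

The function only builds the command; whether to run it, and with which additional settings, depends on the cluster.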

Release v25.08.0

Hardware Requirements:

The plugin is designed to work on NVIDIA Volta, Turing, Ampere, Ada Lovelace, Hopper and Blackwell generation datacenter GPUs. The plugin jar is tested on the following GPUs:

GPU Models: NVIDIA V100, T4, A10, A100, L4, H100 and B100 GPUs

Software Requirements:

OS: Spark RAPIDS is compatible with any Linux distribution with glibc >= 2.28 (check the output of ldd --version). glibc 2.28 was released on August 1, 2018.
Tested on Ubuntu 20.04, Ubuntu 22.04, Rocky Linux 8 and Rocky Linux 9.
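
The glibc requirement can also be checked programmatically. The following is a minimal sketch that compares a version string against 2.28 using Python's standard `platform.libc_ver()`; the helper names are illustrative only. It compares version components as integers, because a plain string comparison would wrongly rank "2.9" above "2.28".

```python
import platform

MIN_GLIBC = (2, 28)  # minimum glibc version stated above


def parse_version(text):
    """Turn a dotted version such as '2.31' into a tuple of ints, e.g. (2, 31)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_ok(version_text, minimum=MIN_GLIBC):
    """Return True when the given glibc version meets the minimum."""
    parsed = parse_version(version_text)
    return bool(parsed) and parsed >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc":
        status = "meets" if glibc_ok(version) else "is below"
        print(f"glibc {version} {status} the 2.28 minimum")
    else:
        print("glibc was not detected on this system")
```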

NVIDIA Driver*: R525+
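
One possible way to check the driver requirement on a worker node is sketched below. It assumes that "R525" refers to the leading component of the driver version string (for example 525 in "525.60.13") and that `nvidia-smi` is on the PATH; the helper names are hypothetical.

```python
import shutil
import subprocess

MIN_DRIVER_BRANCH = 525  # from the "R525+" requirement above


def driver_branch(version_text):
    """Extract the release branch from a driver version, e.g. '535.129.03' -> 535."""
    return int(version_text.strip().split(".")[0])


def driver_ok(version_text, minimum=MIN_DRIVER_BRANCH):
    """Return True when the driver branch is at least the minimum."""
    return driver_branch(version_text) >= minimum


def installed_driver_versions():
    """Ask nvidia-smi for the driver version reported for each GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; cannot check the driver version")
    else:
        try:
            for version in installed_driver_versions():
                print(version, "OK" if driver_ok(version) else "older than R525")
        except subprocess.CalledProcessError as err:
            print("nvidia-smi failed:", err)
```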

Runtime:
	Scala 2.12, 2.13
	Python and Java Virtual Machine (JVM) versions compatible with your Spark version.

	Check the Spark documentation for Python and Java version compatibility with your specific
	Spark version. For instance, visit `https://spark.apache.org/docs/3.4.1` for Spark 3.4.1.

Supported Spark versions:
	Apache Spark 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4
	Apache Spark 3.3.0, 3.3.1, 3.3.2, 3.3.3, 3.3.4
	Apache Spark 3.4.0, 3.4.1, 3.4.2, 3.4.3, 3.4.4
	Apache Spark 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.5, 3.5.6
	Apache Spark 4.0.0
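
The list above can be encoded as a small lookup, for example to fail fast before launching a job. This is only a sketch: the table mirrors the Apache Spark versions listed here, the function name is illustrative, and vendor builds (Databricks, Dataproc and so on) are covered by their own lists and are deliberately rejected by this check.

```python
# Supported patch releases per Apache Spark minor line, as listed above.
SUPPORTED = {
    "3.2": range(0, 5),  # 3.2.0 through 3.2.4
    "3.3": range(0, 5),  # 3.3.0 through 3.3.4
    "3.4": range(0, 5),  # 3.4.0 through 3.4.4
    "3.5": range(0, 7),  # 3.5.0 through 3.5.6
    "4.0": range(0, 1),  # 4.0.0 only
}


def is_supported(version):
    """Return True if an Apache Spark version like '3.5.2' is in the list above."""
    parts = version.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        return False  # not a plain major.minor.patch version
    minor_line = ".".join(parts[:2])
    return int(parts[2]) in SUPPORTED.get(minor_line, ())


if __name__ == "__main__":
    for candidate in ("3.5.6", "3.5.7", "3.1.3", "4.0.0"):
        print(candidate, is_supported(candidate))
```

In a PySpark session the running version is available as `spark.version`, which could be passed to `is_supported`.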

Supported Databricks runtime versions for Azure and AWS:
	Databricks 12.2 ML LTS (GPU, Scala 2.12, Spark 3.3.2)
	Databricks 13.3 ML LTS (GPU, Scala 2.12, Spark 3.4.1)
	Databricks 14.3 ML LTS (GPU, Scala 2.12, Spark 3.5.0)

Supported Dataproc versions (Debian/Ubuntu/Rocky):
	GCP Dataproc 2.1
	GCP Dataproc 2.2
	GCP Dataproc 2.3

Supported Dataproc Serverless versions:
	Spark runtime 1.1 LTS
	Spark runtime 1.2
	Spark runtime 2.0
	Spark runtime 2.1
	Spark runtime 2.2

*Some hardware may have a minimum driver version greater than R525. Check the GPU spec sheet for your hardware’s minimum driver version.

*For Cloudera and EMR support, please refer to the Distributions section of the FAQ.

RAPIDS Accelerator’s Support Policy for Apache Spark

The RAPIDS Accelerator maintains support for Apache Spark versions available for download from Apache Spark.

Download RAPIDS Accelerator for Apache Spark v25.08.0

Processor  Scala Version  Download Jar                 Download Signature
x86_64     Scala 2.12     RAPIDS Accelerator v25.08.0  Signature
x86_64     Scala 2.13     RAPIDS Accelerator v25.08.0  Signature
arm64      Scala 2.12     RAPIDS Accelerator v25.08.0  Signature
arm64      Scala 2.13     RAPIDS Accelerator v25.08.0  Signature

This package is built against CUDA 12.9. It is tested on V100, T4, A10, A100, L4, H100 and GB100 GPUs with CUDA 12.9.

Verify signature

  • Download the PUB_KEY.
  • Import the public key: gpg --import PUB_KEY
  • Verify the signature for Scala 2.12 jar: gpg --verify rapids-4-spark_2.12-25.08.0.jar.asc rapids-4-spark_2.12-25.08.0.jar
  • Verify the signature for Scala 2.13 jar: gpg --verify rapids-4-spark_2.13-25.08.0.jar.asc rapids-4-spark_2.13-25.08.0.jar

Expected output of the signature verification:

gpg: Good signature from "NVIDIA Spark (For the signature of spark-rapids release jars) <sw-spark@nvidia.com>"
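
The verification steps above can be scripted. The sketch below derives the jar and signature file names from the Scala version, following the naming pattern shown in the steps, and wraps `gpg --verify`. It assumes the public key has already been imported and that both files are in the current directory; the function names are illustrative.

```python
import subprocess


def artifact_names(scala_version, release="25.08.0"):
    """Return (jar, signature) file names, e.g. for Scala '2.12'."""
    jar = f"rapids-4-spark_{scala_version}-{release}.jar"
    return jar, jar + ".asc"


def gpg_verify_command(scala_version, release="25.08.0"):
    """Build the same gpg command that the steps above list."""
    jar, signature = artifact_names(scala_version, release)
    return ["gpg", "--verify", signature, jar]


def verify(scala_version, release="25.08.0"):
    """Run gpg and return (accepted, messages).

    accepted is True when gpg exits with status 0.
    """
    result = subprocess.run(
        gpg_verify_command(scala_version, release),
        capture_output=True, text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr
```

Calling `verify("2.12")` in a directory holding both files returns whether gpg accepted the signature together with its messages, which can then be compared with the expected output shown above.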

Release Notes

  • Spark 4.0 support, including ANSI mode support for multiply, AVG and SUM aggregations with improved performance. Compatibility with Spark 4.0 exception handling and type casting.
  • Support for Spark Connect in Spark 3.5.6 and Spark 4.0
  • Delta Lake 3.3.x open source support with read, update, merge, delete, optimized write and auto compact functionality. Read is supported without deletion vectors.
  • Iceberg S3 tables support
  • Support scalar * scalar overload function for enhanced mathematical operations
  • Improved expression combining and side-effect checking for GpuCaseWhen operations
  • Better Parquet type conversion for improved data processing
  • Improved memory management with fixes for memory leaks and overflow handling
  • Performance optimizations for GPU kernel usage and aggregation operations
  • Stability improvements and bug fixes for production workloads

Note: There is a known issue in the 25.08.0 release when decompressing gzip files on H100 GPUs. Please find more details in issue-16661.

For a detailed list of changes, please refer to the CHANGELOG.

Archived releases

As new releases come out, previous ones will still be available in archived releases.