Apache Spark 3.0+ lets users provide a plugin that can replace the backend for SQL and DataFrame operations. This requires no API changes from the user. The plugin replaces the SQL operations it supports with GPU-accelerated versions; any operation it does not support falls back to the Spark CPU version. Note that the plugin cannot accelerate operations that manipulate RDDs directly.
The accelerator library also provides an implementation of Spark’s shuffle that can leverage UCX to optimize GPU data transfers, keeping as much data on the GPU as possible and bypassing the CPU for GPU-to-GPU transfers.
The GPU accelerated processing plugin does not require the accelerated shuffle implementation. However, if accelerated SQL processing is not enabled, the shuffle implementation falls back to the default
To enable GPU processing acceleration you will need:
- Apache Spark 3.1+
- A Spark cluster configured with GPUs that comply with the requirements for RAPIDS.
- One GPU per executor.
- The RAPIDS Accelerator for Apache Spark plugin jar.
- To set the config
Apache Spark 3.0 now supports GPU scheduling as long as you are using a cluster manager that supports it. You can have Spark request GPUs and assign them to tasks. The exact configs you use will vary depending on your cluster manager. Here are some example configs:
- Request your executor to have GPUs:
- Specify the number of GPUs per task:
- Specify a GPU discovery script (required on YARN and K8S):
- Explain why some operations of a query were or were not placed on the GPU:
--conf spark.rapids.sql.explain=ALL will display whether each operation is placed on the GPU.
--conf spark.rapids.sql.explain=NOT_ON_GPU will display only the parts that did not go on the GPU; this is the default setting.
--conf spark.rapids.sql.explain=NONE will disable the log of
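The options listed above can be collected into a set of spark-submit flags. The following is a minimal sketch, not a definitive recipe. The helper name `gpu_scheduling_confs` is hypothetical. The property names `spark.executor.resource.gpu.amount` and `spark.executor.resource.gpu.discoveryScript` are the standard Spark resource-scheduling keys and are assumed here, since the text above describes them without naming them. The discovery script path in the usage note is a placeholder. The sketch only builds the argument list and does not launch Spark.

```python
# Sketch only: assembles spark-submit flags; it does not start a Spark job.
VALID_EXPLAIN_MODES = {"ALL", "NOT_ON_GPU", "NONE"}


def gpu_scheduling_confs(executor_gpus=1, task_gpus=1,
                         discovery_script=None, explain="NOT_ON_GPU"):
    """Return a flat list of spark-submit arguments for GPU scheduling.

    Hypothetical helper. The property keys are assumed from the standard
    Spark resource-scheduling configuration.
    """
    if explain not in VALID_EXPLAIN_MODES:
        raise ValueError(f"unknown explain mode: {explain!r}")

    confs = {
        # GPUs requested for each executor (the plugin expects one).
        "spark.executor.resource.gpu.amount": executor_gpus,
        # GPUs (or fraction of a GPU) assigned to each task.
        "spark.task.resource.gpu.amount": task_gpus,
        # Controls logging of which operations were placed on the GPU.
        "spark.rapids.sql.explain": explain,
    }
    if discovery_script is not None:
        # Needed on cluster managers that require a discovery script.
        confs["spark.executor.resource.gpu.discoveryScript"] = discovery_script

    args = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

For example, `" ".join(gpu_scheduling_confs(discovery_script="./getGpusResources.sh"))` produces a string that could be appended to a spark-submit command line; the script path there is only a placeholder for whatever discovery script your deployment provides.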
See the deployment-specific sections for more details and restrictions. Note that spark.task.resource.gpu.amount can be a decimal amount, so if you want multiple tasks to run on an executor at the same time and be assigned to the same GPU, you can set it to a decimal value less than 1. You would want this setting to correspond to the spark.executor.cores setting. For instance, if you have spark.executor.cores=2, which allows 2 tasks to run on each executor, and you want those 2 tasks to run on the same GPU, then you would set spark.task.resource.gpu.amount=0.5. See the Tuning Guide for more details on controlling the task concurrency for each executor.
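The arithmetic behind that example can be written down in a few lines. This is a rough sketch under two assumptions that are not stated in the text: each executor has exactly one GPU, and each task uses one core (the Spark default for spark.task.cpus). The function name `gpu_amount_per_task` is hypothetical.

```python
def gpu_amount_per_task(executor_cores):
    """Fraction of one GPU to assign to each task so that every task slot
    on an executor shares that executor's single GPU.

    Assumes one GPU per executor and one core per task.
    """
    if executor_cores < 1:
        raise ValueError("executor_cores must be at least 1")
    return 1 / executor_cores


# With spark.executor.cores=2, two tasks share the GPU.
amount = gpu_amount_per_task(2)
print(f"spark.task.resource.gpu.amount={amount}")
```

With 2 executor cores this yields 0.5, matching the example in the text. Whether running that many tasks on one GPU at once is a good idea depends on GPU memory and the workload, which is what the Tuning Guide covers.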
You can also refer to the official Apache Spark documentation.
- Kubernetes specific documentation
- YARN specific documentation
- Standalone specific documentation
If you plan to convert an existing Spark workload from CPU to GPU, please refer to this Spark workload qualification to check whether your Spark applications are a good fit for the RAPIDS Accelerator for Apache Spark.
If you plan to compare the performance of CPU and GPU Spark jobs, please visit the spark-rapids-benchmarks repo for benchmark tests that use the RAPIDS Accelerator for Apache Spark.