Setting up a Microk8s environment

Microk8s is a lightweight Kubernetes distribution that is well suited to local development and testing. These instructions explain how to set up Microk8s on Ubuntu.

Install Microk8s

Install version 1.20. The 1.21 release has known issues with GPU support. These may have been resolved in the 1.22 release, but that release has not yet been tested with the RAPIDS Accelerator.

sudo snap install microk8s --classic --channel=1.20/stable

Permissions

To avoid the need to use sudo when running microk8s commands, add the current user to the microk8s group and ensure that the user has access to files in the ~/.kube folder. The group change only takes effect in a new login session.

sudo usermod -a -G microk8s $USER
sudo chown -f -R $USER ~/.kube
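
To confirm that the current shell has picked up the new group, a check along the following lines can be used. This is a minimal sketch rather than part of Microk8s itself: session_has_group is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: succeed if the current session already belongs to GROUP.
# `id -nG` without a user argument lists the groups of the running process,
# which is what decides whether sudo is still needed.
session_has_group() {
    id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

if session_has_group microk8s; then
    echo "microk8s group is active in this session"
else
    echo "microk8s group not active yet: log out and back in, or run 'newgrp microk8s'"
fi
```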

Generate Kube config

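Back up any existing kube configuration file and then generate a new one.

mkdir -p ~/.kube
# Keep a copy of any existing config before overwriting it (config.bak is an arbitrary name)
[ -f ~/.kube/config ] && cp ~/.kube/config ~/.kube/config.bak
microk8s config > ~/.kube/config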

Enable DNS and GPU support

microk8s.enable dns
microk8s.enable gpu

Add cluster host name to /etc/hosts

Add kubernetes.default.svc.cluster.local to /etc/hosts, mapped to the same IP address as the machine's host name. This is usually 127.0.1.1 or 127.0.0.1.

This host name should be used in the spark-submit command by specifying --master k8s://https://kubernetes.default.svc.cluster.local:16443.
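
The /etc/hosts change can be made repeatable with a small helper such as the one sketched below. It is an illustration only: hosts_entry_if_missing is a hypothetical name, and the IP address in the example comment is an assumption that should be replaced with the address your host name actually maps to.

```shell
# Hypothetical helper: print "IP NAME" unless HOSTS_FILE already mentions NAME.
# Printing instead of writing keeps the function usable without root.
hosts_entry_if_missing() {
    hosts_file=$1
    ip=$2
    name=$3
    if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name"
    fi
}

# Example (127.0.1.1 is an assumption; check your own /etc/hosts first):
# hosts_entry_if_missing /etc/hosts 127.0.1.1 kubernetes.default.svc.cluster.local | sudo tee -a /etc/hosts
```

Because the helper prints nothing when the name is already present, running the example a second time leaves /etc/hosts unchanged.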

Create Spark service account

microk8s.kubectl create serviceaccount spark

This service account can then be specified in the spark-submit command with --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark.

Specifying a certificate

The Microk8s certificate can be specified with --conf spark.kubernetes.authenticate.caCertFile=/var/snap/microk8s/current/certs/ca.crt.

Set K8S_TOKEN environment variable with token

View a list of secrets using the following command.

microk8s.kubectl -n kube-system get secrets

Look for a secret with a name starting with default-token- and run the following command to view the secret, replacing default-token-5bmbh with the name of your secret.

microk8s.kubectl -n kube-system describe secret default-token-5bmbh

Copy the value of the token attribute and use it to set a K8S_TOKEN environment variable.

export K8S_TOKEN=<value from above>

This token can then be specified in the spark-submit command with --conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN.
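
The copy-and-paste step can be avoided by filtering the describe output. The following is a minimal sketch, assuming the output contains a line of the form "token: <value>"; extract_token is a hypothetical helper name, and the secret name in the example comment is the one used above and will differ on your cluster.

```shell
# Hypothetical helper: read `kubectl describe secret` output on stdin and
# print the value that follows the "token:" label.
extract_token() {
    awk '$1 == "token:" { print $2 }'
}

# Example (replace default-token-5bmbh with the name of your secret):
# export K8S_TOKEN=$(microk8s.kubectl -n kube-system describe secret default-token-5bmbh | extract_token)
```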

Building and exporting Docker images

Follow the instructions in Getting Started with RAPIDS and Kubernetes to create Docker images containing Spark and the RAPIDS Accelerator for Apache Spark.

Note that an additional step is required to export the Docker images from the host and import them into the Microk8s cluster.

For example, the following commands can be used to export a Docker image named spark-rapids and import it into Microk8s.

docker save spark-rapids > /tmp/spark-rapids.tar
microk8s ctr image import /tmp/spark-rapids.tar

Spark Submit

After completing the above steps, it should be possible to use spark-submit to run a Spark job in Microk8s. The following partial spark-submit example shows the settings that should be present. The job-specific options, the application JAR, and the application arguments go after the final line.

$SPARK_HOME/bin/spark-submit \
    --master k8s://https://kubernetes.default.svc.cluster.local:16443 \
    --deploy-mode cluster \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.authenticate.caCertFile=/var/snap/microk8s/current/certs/ca.crt \
    --conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN \
    --conf spark.kubernetes.container.image=spark-rapids \

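Before submitting, it can help to confirm that the token and certificate referenced by these settings are actually available. The check below is a sketch under stated assumptions: check_submit_prereqs is a hypothetical helper, and it only tests that K8S_TOKEN is non-empty and that the certificate file is readable, not that either is valid.

```shell
# Hypothetical pre-flight check for the settings used in the example above.
# Prints one message per problem and returns non-zero if any were found.
check_submit_prereqs() {
    ca_cert=${1:-/var/snap/microk8s/current/certs/ca.crt}
    problems=0
    if [ -z "${K8S_TOKEN:-}" ]; then
        echo "K8S_TOKEN is not set" >&2
        problems=1
    fi
    if [ ! -r "$ca_cert" ]; then
        echo "CA certificate not readable: $ca_cert" >&2
        problems=1
    fi
    return "$problems"
}
```

Running check_submit_prereqs with no arguments checks the default Microk8s certificate path. A different path can be passed as the first argument.
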
Further Reading

The following documentation pages and blog posts provide more information on setting up Microk8s, Apache Spark, and RAPIDS.