Quickstart
==========

**Note that the features described below are currently in beta.**

.. contents:: Table of Contents
   :local:

Description and Requirements
----------------------------

NVIDIA driver container images are available through the `NVIDIA public hub repository `_.
They allow provisioning the NVIDIA driver through containers, which provides several benefits
over a standard driver installation, for example:

* Ease of deployment
* Fast installation
* Reproducibility

For more information about the driver container's internals, check out this `presentation `_.

The list of prerequisites for running a driver container is described below.

#. Ubuntu 16.04, Ubuntu 18.04 or CentOS 7 with the IPMI driver enabled and the Nouveau driver disabled
#. NVIDIA GPU with an architecture newer than Fermi (compute capability > 2.1)
#. A `supported version of Docker `_
#. The `NVIDIA Container Runtime for Docker `_ configured with the ``root`` option
#. If you are running Ubuntu 18.04 with an AWS kernel, you also need to enable the ``i2c_core`` kernel module

Configuration
-------------

You will need to update the NVIDIA Container Toolkit config file so that the ``root`` directive
points to the driver container:

.. code-block:: toml

   disable-require = false
   #swarm-resource = "DOCKER_RESOURCE_GPU"

   [nvidia-container-cli]
   root = "/run/nvidia/driver"
   #path = "/usr/bin/nvidia-container-cli"
   environment = []
   #debug = "/var/log/nvidia-container-toolkit.log"
   #ldcache = "/etc/ld.so.cache"
   load-kmods = true
   #no-cgroups = false
   #user = "root:video"
   ldconfig = "@/sbin/ldconfig.real"

   [nvidia-container-runtime]
   #debug = "/var/log/nvidia-container-runtime.log"

Examples
--------

.. code-block:: sh

   # Run the driver container for Ubuntu 16.04 LTS in interactive mode
   docker run -it --name nvidia-driver --privileged --pid=host -v /run/nvidia:/run/nvidia:shared \
     nvidia/driver:396.37-ubuntu16.04

   # Run the driver container for Ubuntu 16.04 AWS in detached mode
   docker run -d --name nvidia-driver --privileged --pid=host -v /run/nvidia:/run/nvidia:shared \
     nvidia/driver:396.37-ubuntu16.04-aws --accept-license

   # Run the driver container for Ubuntu 16.04 HWE in detached mode with
   # auto-restarts and auto-detection of kernel updates (aka DKMS)
   docker run -d --name nvidia-driver --privileged --pid=host -v /run/nvidia:/run/nvidia:shared \
     --restart=unless-stopped -v /etc/kernel/postinst.d:/run/kernel/postinst.d \
     nvidia/driver:396.37-ubuntu16.04-hwe --accept-license

   # Run the driver container for CentOS 7 in detached mode and check its logs
   docker run -d --name nvidia-driver --privileged --pid=host -v /run/nvidia:/run/nvidia:shared \
     nvidia/driver:396.37-centos7 --accept-license
   docker logs -f nvidia-driver

   # Build a custom driver container image for CentOS 7 with the current kernel
   docker build -t nvidia-driver:centos7 --build-arg KERNEL_VERSION=$(uname -r) \
     https://gitlab.com/nvidia/driver.git#centos7

   # Perform a driver update ahead of time for a given kernel version
   docker exec nvidia-driver nvidia-driver update --kernel 4.15.0-23
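Once a driver container from the examples above reports a successful installation (for
instance via ``docker logs``), a quick way to validate the setup is to check that the driver
files are present under ``/run/nvidia/driver`` on the host and that a CUDA container can see
the GPUs through the NVIDIA runtime. The commands below are a minimal sketch and assume the
container was started as shown above and that the ``root`` option from the Configuration
section is in place.

.. code-block:: sh

   # The driver container mounts the installed driver under /run/nvidia/driver,
   # which is the path configured as "root" in config.toml
   ls /run/nvidia/driver

   # Run nvidia-smi from a CUDA container through the NVIDIA runtime
   docker run --rm --runtime=nvidia nvidia/cuda:9.2-base nvidia-smi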
Quickstart
----------

Ubuntu Distributions
~~~~~~~~~~~~~~~~~~~~

.. code-block:: sh

   curl https://get.docker.com | sudo CHANNEL=stable sh

   distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
   curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
   curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list \
     | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
   sudo apt-get update && sudo apt-get install -y nvidia-docker2

   sudo sed -i 's/^#root/root/' /etc/nvidia-container-runtime/config.toml

   sudo tee /etc/modules-load.d/ipmi.conf <<< "ipmi_msghandler"
   sudo tee /etc/modprobe.d/blacklist-nouveau.conf <<< "blacklist nouveau"
   sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf <<< "options nouveau modeset=0"

   # If you are running with an AWS kernel, also load the i2c_core module
   sudo tee -a /etc/modules-load.d/ipmi.conf <<< "i2c_core"

   sudo update-initramfs -u

   # Optionally, if the kernel is not up to date
   # sudo apt-get dist-upgrade

   sudo reboot

   sudo docker run -d --privileged --pid=host -v /run/nvidia:/run/nvidia:shared \
     --restart=unless-stopped nvidia/driver:418.40.04-ubuntu18.04 --accept-license

   sudo docker run --rm --runtime=nvidia nvidia/cuda:9.2-base nvidia-smi

CentOS Distributions
~~~~~~~~~~~~~~~~~~~~

.. code-block:: sh

   curl https://get.docker.com | sudo CHANNEL=stable sh
   sudo systemctl enable docker

   curl -s -L https://nvidia.github.io/nvidia-docker/centos7/nvidia-docker.repo \
     | sudo tee /etc/yum.repos.d/nvidia-docker.repo
   sudo yum install -y nvidia-docker2

   sudo sed -i 's/^#root/root/' /etc/nvidia-container-runtime/config.toml

   sudo tee /etc/modules-load.d/ipmi.conf <<< "ipmi_msghandler"
   sudo tee /etc/modprobe.d/blacklist-nouveau.conf <<< "blacklist nouveau"
   sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf <<< "options nouveau modeset=0"

   # Optionally, if the kernel is not up to date
   # sudo yum update

   sudo reboot

   sudo docker run -d --privileged --pid=host -v /run/nvidia:/run/nvidia:shared \
     --restart=unless-stopped nvidia/driver:396.37-centos7 --accept-license

   sudo docker run --rm --runtime=nvidia nvidia/cuda:9.2-base nvidia-smi

Kubernetes with dockerd
-----------------------

Install ``nvidia-docker2`` and modify ``/etc/nvidia-container-runtime/config.toml`` as mentioned
above. You also need to set the default Docker runtime to ``nvidia`` (a minimal sketch of that
configuration is shown at the end of this page).

.. code-block:: sh

   # If running on bare-metal
   kubectl create -f https://gitlab.com/nvidia/samples/raw/master/driver/ubuntu16.04/kubernetes/nvidia-driver.yml

   # If running on AWS
   kubectl create -f https://gitlab.com/nvidia/samples/raw/master/driver/ubuntu16.04/kubernetes/nvidia-driver-aws.yml

You can now deploy the `NVIDIA device plugin `_. Deleting the pod will unload the NVIDIA driver
from the machine:

.. code-block:: sh

   kubectl delete daemonset.apps/nvidia-driver-daemonset

Tags available
--------------

Check the `DockerHub `_ page for the list of available tags.
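As noted in the Kubernetes section above, ``dockerd`` must use ``nvidia`` as its default
runtime. The snippet below is a minimal sketch of that configuration; it assumes the
``nvidia-container-runtime`` binary installed by ``nvidia-docker2`` is on the ``PATH`` and that
no existing settings in ``/etc/docker/daemon.json`` need to be preserved.

.. code-block:: sh

   # Register the NVIDIA runtime and make it the default for dockerd.
   # Warning: this overwrites any existing /etc/docker/daemon.json.
   sudo tee /etc/docker/daemon.json <<'EOF'
   {
       "default-runtime": "nvidia",
       "runtimes": {
           "nvidia": {
               "path": "nvidia-container-runtime",
               "runtimeArgs": []
           }
       }
   }
   EOF

   # Restart Docker so the new default runtime takes effect
   sudo systemctl restart docker

After the restart, containers started by dockerd (including Kubernetes pods) use the NVIDIA
runtime without an explicit ``--runtime=nvidia`` flag.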