Prerequisites
The following prerequisites are required to configure your cluster to deploy Confidential Containers.
Refer to the Supported Platforms page for validated hardware and software versions.
Hardware and BIOS
Use a supported platform configured for Confidential Computing. For more information on machine setup, refer to Supported Platforms.
Ensure hosts are configured to enable hardware virtualization and Access Control Services (ACS). With some AMD CPUs and BIOSes, ACS might be grouped under Advanced Error Reporting (AER). Enable these features in the host BIOS.
Configure hosts to support IOMMU. You can check if your host is configured for IOMMU by running the following command:
$ ls /sys/kernel/iommu_groups
If the output of this command includes 0, 1, and so on, then your host is configured for IOMMU.
If the host is not configured or if you are unsure, add the amd_iommu=on Linux kernel command-line argument for AMD CPUs, or intel_iommu=on for Intel CPUs. For most Linux distributions, add the argument to the /etc/default/grub file, for instance:
...
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on modprobe.blacklist=nouveau"
...
After making the change, configure the bootloader.
$ sudo update-grub
Example Output:
Sourcing file `/etc/default/grub'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.15.0-generic
Found initrd image: /boot/initrd.img-5.15.0-generic
done
Reboot the host after configuring the bootloader.
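After the reboot, the IOMMU check described above can be scripted. The following is a minimal sketch, not part of the official procedure: the function names count_iommu_groups and cmdline_has_iommu are hypothetical helpers introduced here for illustration, and the directory argument exists only so the logic can be exercised without a real sysfs tree.

```shell
# Count the entries in an IOMMU groups directory.
# Defaults to /sys/kernel/iommu_groups; prints 0 if the directory is missing.
count_iommu_groups() {
  local dir="${1:-/sys/kernel/iommu_groups}"
  local count=0
  local entry
  if [ ! -d "$dir" ]; then
    echo 0
    return 0
  fi
  for entry in "$dir"/*; do
    # The glob stays unexpanded when the directory is empty, so test existence.
    if [ -e "$entry" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Return 0 if a kernel command line contains amd_iommu=on or intel_iommu=on.
cmdline_has_iommu() {
  local cmdline="$1"
  case " $cmdline " in
    *" amd_iommu=on "*|*" intel_iommu=on "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

On a host, calling count_iommu_groups with no argument and cmdline_has_iommu "$(cat /proc/cmdline)" gives a quick indication of whether the kernel argument took effect and IOMMU groups were created. A count of 0 suggests that IOMMU is still not enabled.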
Note
After configuring IOMMU, you might see QEMU warnings about PCI P2P DMA when running GPU workloads. These are expected and can be safely ignored. Refer to Limitations and Restrictions for details.
Ensure that no NVIDIA GPU drivers are installed on the host. Confidential Containers uses VFIO to pass GPUs directly to the confidential VM, and host-level GPU drivers interfere with VFIO device binding.
To check if NVIDIA GPU drivers are installed, run the following command:
$ lsmod | grep nvidia
If the output is empty, no NVIDIA GPU drivers are loaded. If modules such as nvidia, nvidia_uvm, or nvidia_modeset are listed, NVIDIA GPU drivers are present and must be removed before proceeding. Refer to Removing the Driver in the NVIDIA Driver Installation Guide.
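For scripted checks, the lsmod inspection above can be narrowed to module names only. This is a small sketch under the assumption that the input follows the usual lsmod layout (a header line, then one module per line with the name in the first column). The function name list_nvidia_modules is hypothetical.

```shell
# Print the names of loaded kernel modules that start with "nvidia", one per line.
# Reads lsmod-formatted text from stdin and skips the header line.
list_nvidia_modules() {
  awk 'NR > 1 && $1 ~ /^nvidia/ { print $1 }'
}
```

Running lsmod | list_nvidia_modules on the host prints nothing when no NVIDIA modules are loaded. Unlike a plain grep, it matches only the module name column, so it does not report unrelated modules that merely list an NVIDIA module in their "Used by" column.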
Kubernetes Cluster
A Kubernetes cluster with cluster administrator privileges. Refer to the Supported Software Components table for supported Kubernetes versions.
containerd version 2.2.2 installed. Refer to the containerd Getting Started guide for installation instructions.
To verify the installed version, run the following command:
$ containerd --version
Example Output:
containerd containerd.io 2.2.2 ...
Helm installed. Use the command below to install Helm or refer to the Helm documentation for installation instructions.
$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
    && chmod 700 get_helm.sh \
    && ./get_helm.sh
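The containerd version check can also be automated. The sketch below assumes the version string is the third whitespace-separated field of the containerd --version output, as in the example output above, and strips a leading "v" if one is present. The function name containerd_version_from is a hypothetical helper.

```shell
# Extract the version number from one line of `containerd --version` output.
# Assumes the version is the third field; a leading "v" is removed if present.
containerd_version_from() {
  local line="$1"
  local version
  version="$(printf '%s\n' "$line" | awk '{ print $3 }')"
  printf '%s\n' "${version#v}"
}
```

For example, comparing the result of containerd_version_from "$(containerd --version)" against 2.2.2, and running command -v helm, confirms both tool prerequisites in a provisioning script.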
Enable the KubeletPodResourcesGet and RuntimeClassInImageCriApi Kubelet feature gates on your cluster.
KubeletPodResourcesGet: Enabled by default on Kubernetes v1.34 and later. On older versions, you must enable it explicitly. The Kata runtime uses this feature gate to query the Kubelet Pod Resources API and discover allocated GPU devices during sandbox creation.
RuntimeClassInImageCriApi: Alpha since Kubernetes v1.29 and is not enabled by default. This feature gate is required to support pod deployments that use multiple snapshotters side-by-side.
Add both feature gates to your Kubelet configuration (typically /var/lib/kubelet/config.yaml):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletPodResourcesGet: true
  RuntimeClassInImageCriApi: true
If your config.yaml already has a featureGates section, add the gates to the existing section rather than creating a duplicate.
Restart the Kubelet service to apply the changes:
$ sudo systemctl restart kubelet
Configure image pull timeouts. The guest-pull mechanism pulls images inside the confidential VM, which means large images can take longer to download and delay container start. Kubelet can de-allocate your pod if the image pull exceeds the configured timeout before the container transitions to the running state.
If you plan to use large images, increase runtimeRequestTimeout in your kubelet configuration to 20m to match the default values for the NVIDIA shim configurations in Kata Containers.
Add or update the runtimeRequestTimeout field in your kubelet configuration (typically /var/lib/kubelet/config.yaml):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
runtimeRequestTimeout: 20m
Restart the kubelet service to apply the change:
$ sudo systemctl restart kubelet
Optionally, you can configure additional timeouts for the NVIDIA shim and Kata Agent Policy. The NVIDIA shim configurations in Kata Containers use a default create_container_timeout of 1200 seconds (20 minutes). This controls how long the shim allows a container to remain in the container-creating state. If you need a timeout of more than 1200 seconds, you must also adjust the Kata Agent Policy's image_pull_timeout value, which controls the agent-side timeout for guest image pulls. To do this, add the agent.image_pull_timeout kernel parameter to your shim configuration, or pass an explicit value in the io.katacontainers.config.hypervisor.kernel_params: "..." pod annotation.
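Before moving on, it can help to confirm that the kubelet settings from this section are present in the configuration file. The following is a rough sketch that uses line matching rather than a real YAML parser, so it assumes each feature gate sits on its own indented "Name: true" line and that runtimeRequestTimeout is an unquoted top-level key. The function names are hypothetical.

```shell
# Return 0 if the named feature gate is set to true in a kubelet config file.
# Simple line match, not a YAML parser: expects an indented "Gate: true" line.
kubelet_gate_enabled() {
  local file="$1"
  local gate="$2"
  grep -Eq "^[[:space:]]+${gate}:[[:space:]]*true[[:space:]]*\$" "$file"
}

# Print the top-level runtimeRequestTimeout value from a kubelet config file.
# Prints nothing if the key is absent.
kubelet_runtime_request_timeout() {
  local file="$1"
  awk -F': *' '$1 == "runtimeRequestTimeout" { print $2 }' "$file"
}
```

For example, kubelet_gate_enabled /var/lib/kubelet/config.yaml RuntimeClassInImageCriApi checks one gate, and kubelet_runtime_request_timeout /var/lib/kubelet/config.yaml should print 20m if the timeout was raised as described. A gate that is enabled by default, such as KubeletPodResourcesGet on Kubernetes v1.34 and later, does not need to appear in the file, so a non-zero result for it is not necessarily a problem.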
Next Steps
After completing the prerequisites, proceed to Deploy Confidential Containers.