Prerequisites

Complete the following prerequisites before you configure your cluster to deploy Confidential Containers.

Refer to the Supported Platforms page for validated hardware and software versions.

Hardware and BIOS

  • Use a supported platform configured for Confidential Computing. For more information on machine setup, refer to Supported Platforms.

  • Enable hardware virtualization and Access Control Services (ACS) in the host BIOS. With some AMD CPUs and BIOSes, ACS might be grouped under Advanced Error Reporting (AER).

  • Configure hosts to support IOMMU. You can check if your host is configured for IOMMU by running the following command:

    $ ls /sys/kernel/iommu_groups
    

    If the output lists numbered IOMMU groups (0, 1, and so on), your host is configured for IOMMU. If the output is empty, IOMMU is not enabled.

    If the host is not configured or if you are unsure, add the amd_iommu=on Linux kernel command-line argument for AMD CPUs, or intel_iommu=on for Intel CPUs. For most Linux distributions, add the argument to the /etc/default/grub file, for instance:

    ...
    GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on modprobe.blacklist=nouveau"
    ...
    

    After making the change, regenerate the bootloader configuration. For example, on Debian-based or Ubuntu-based distributions:

    $ sudo update-grub
    

    Example Output:

    Sourcing file `/etc/default/grub'
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.15.0-generic
    Found initrd image: /boot/initrd.img-5.15.0-generic
    done
    

    Reboot the host after configuring the bootloader.

    Note

    After configuring IOMMU, you might see QEMU warnings about PCI P2P DMA when running GPU workloads. These are expected and can be safely ignored. Refer to Limitations and Restrictions for details.

  • Ensure that no NVIDIA GPU drivers are installed on the host. Confidential Containers uses VFIO to pass GPUs directly to the confidential VM, and host-level GPU drivers interfere with VFIO device binding.

    To check if NVIDIA GPU drivers are installed, run the following command:

    $ lsmod | grep nvidia
    

    If the output is empty, no NVIDIA GPU drivers are loaded. If modules such as nvidia, nvidia_uvm, or nvidia_modeset are listed, NVIDIA GPU drivers are present and must be removed before proceeding. Refer to Removing the Driver in the NVIDIA Driver Installation Guide.
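
The IOMMU and driver checks above can be combined into a small preflight script. The following is a minimal sketch, not an official tool: the function names count_iommu_groups, nvidia_modules_loaded, and preflight are hypothetical, and the optional path arguments exist only so the helpers can be pointed at test fixtures instead of the live system.

```shell
# Hypothetical preflight helpers; not part of any NVIDIA or Kata tooling.

# Print the number of entries under the IOMMU groups directory
# (default: /sys/kernel/iommu_groups). Prints 0 if it is missing or empty.
count_iommu_groups() {
  local dir="${1:-/sys/kernel/iommu_groups}"
  if [ -d "$dir" ]; then
    find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# Exit 0 if any module whose name starts with "nvidia" appears in the
# module table (default: /proc/modules, the same data lsmod prints).
nvidia_modules_loaded() {
  local modules="${1:-/proc/modules}"
  grep -q '^nvidia' "$modules" 2>/dev/null
}

# Run both checks against the live host and print a short report.
preflight() {
  if [ "$(count_iommu_groups)" -gt 0 ]; then
    echo "IOMMU: groups found"
  else
    echo "IOMMU: no groups found; check BIOS settings and kernel arguments"
  fi
  if nvidia_modules_loaded; then
    echo "NVIDIA: driver modules loaded; remove the host driver first"
  else
    echo "NVIDIA: no driver modules loaded"
  fi
}
```

To use the sketch, source the file on the host and run preflight. The checks only read from /sys and /proc, so they do not change host state.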

Kubernetes Cluster

  • A Kubernetes cluster with cluster administrator privileges. Refer to the Supported Software Components table for supported Kubernetes versions.

  • containerd version 2.2.2 installed. Refer to the containerd Getting Started guide for installation instructions.

    To verify the installed version, run the following command:

    $ containerd --version
    

    Example Output:

    containerd containerd.io 2.2.2 ...
    
  • Helm installed. Use the command below to install Helm or refer to the Helm documentation for installation instructions.

    $ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
          && chmod 700 get_helm.sh \
          && ./get_helm.sh
    
  • Enable the KubeletPodResourcesGet and RuntimeClassInImageCriApi kubelet feature gates on your cluster.

    • KubeletPodResourcesGet: Enabled by default on Kubernetes v1.34 and later. On older versions, you must enable it explicitly. The Kata runtime uses this feature gate to query the kubelet Pod Resources API and discover allocated GPU devices during sandbox creation.

    • RuntimeClassInImageCriApi: Alpha since Kubernetes v1.29 and not enabled by default. This feature gate is required to support pod deployments that use multiple snapshotters side-by-side.

    Add both feature gates to your kubelet configuration (typically /var/lib/kubelet/config.yaml):

    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    featureGates:
      KubeletPodResourcesGet: true
      RuntimeClassInImageCriApi: true
    

    If your config.yaml already has a featureGates section, add the gates to the existing section rather than creating a duplicate.

    Restart the kubelet service to apply the changes:

    $ sudo systemctl restart kubelet
    
  • Configure image pull timeouts. The guest-pull mechanism pulls images inside the confidential VM, so large images can take longer to download and delay container start. The kubelet can deallocate your pod if the image pull exceeds the configured timeout before the container transitions to the running state.

    If you plan to use large images, increase runtimeRequestTimeout in your kubelet configuration to 20m to match the default values for the NVIDIA shim configurations in Kata Containers.

    Add or update the runtimeRequestTimeout field in your kubelet configuration (typically /var/lib/kubelet/config.yaml):

    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    runtimeRequestTimeout: 20m
    

    Restart the kubelet service to apply the change:

    $ sudo systemctl restart kubelet
    

    Optionally, you can configure additional timeouts for the NVIDIA shim and the Kata Agent Policy. The NVIDIA shim configurations in Kata Containers use a default create_container_timeout of 1200 seconds (20 minutes), which controls how long the shim allows a container to remain in the ContainerCreating state. If you need a timeout longer than 1200 seconds, you must also adjust the Kata Agent Policy’s image_pull_timeout value, which controls the agent-side timeout for guest image pulls. To do this, add the agent.image_pull_timeout kernel parameter to your shim configuration, or pass an explicit value in the io.katacontainers.config.hypervisor.kernel_params: "..." pod annotation.
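
The kubelet settings described above can be spot-checked with a short script. This is a minimal sketch under stated assumptions: the function name check_kubelet_config is hypothetical, the default path is the typical location named above, and the line-oriented grep patterns assume each key sits on its own line as in the examples, so this is not a real YAML parser.

```shell
# Hypothetical helper: report whether a kubelet config file contains the
# feature gates and timeout described above. Assumes one key per line.
check_kubelet_config() {
  local cfg="${1:-/var/lib/kubelet/config.yaml}"
  local rc=0
  local key
  for key in \
    'KubeletPodResourcesGet: true' \
    'RuntimeClassInImageCriApi: true' \
    'runtimeRequestTimeout: 20m'
  do
    # Match the key on a line of its own, allowing leading indentation.
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*$" "$cfg" 2>/dev/null; then
      echo "ok: ${key}"
    else
      echo "missing: ${key}"
      rc=1
    fi
  done
  return "$rc"
}
```

A "missing" line is not always a problem: KubeletPodResourcesGet is enabled by default on Kubernetes v1.34 and later, and the 20m timeout only matters if you plan to pull large images. Treat the output as a prompt to review the file, not as a definitive verdict.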

Next Steps

After completing the prerequisites, proceed to Deploy Confidential Containers.