Configuring Multi-GPU Passthrough Support#

Multi-GPU passthrough assigns all GPUs and NVSwitches on a node to a single Confidential Container virtual machine. This configuration is required for NVSwitch-based (NVLink) HGX systems that run confidential workloads.

All GPUs and NVSwitches on the node must be assigned to the same Confidential Container virtual machine; configuring only a subset of the GPUs on a node for Confidential Computing is not supported.

Prerequisites#

Set the Confidential Computing Mode#

The required CC mode depends on your GPU architecture.

Set the NODE_NAME environment variable to the name of the node you want to configure:

$ export NODE_NAME="<node-name>"
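
If you are unsure of the node name, you can list the nodes in the cluster first:

$ kubectl get nodes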

NVIDIA Hopper architecture:

Multi-GPU passthrough on Hopper uses protected PCIe (PPCIE), which claims exclusive use of the NVSwitches for a single Confidential Container. Set the node’s CC mode to ppcie:

$ kubectl label node $NODE_NAME nvidia.com/cc.mode=ppcie --overwrite

NVIDIA Blackwell architecture:

The Blackwell architecture uses NVLink encryption, which places the NVSwitches outside of the Trusted Computing Base (TCB), so the ppcie mode is not required. Set the node’s CC mode to on:

$ kubectl label node $NODE_NAME nvidia.com/cc.mode=on --overwrite

Refer to Managing the Confidential Computing Mode for details on verifying the mode change.
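
As a quick check that the label was applied, you can read it back from the node. This confirms only that the label is set, not that the mode change has completed:

$ kubectl get node $NODE_NAME -o jsonpath='{.metadata.labels.nvidia\.com/cc\.mode}'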

Run a Multi-GPU Workload#

  1. Create a file, such as multi-gpu-kata.yaml, with a pod manifest that requests all GPUs and NVSwitches on the node:

    apiVersion: v1
    kind: Pod
    metadata:
      name: multi-gpu-kata
      namespace: default
    spec:
      runtimeClassName: kata-qemu-nvidia-gpu-snp
      restartPolicy: Never
      containers:
        - name: cuda-sample
          image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
          resources:
            limits:
              nvidia.com/pgpu: "8"
              nvidia.com/nvswitch: "4"
              memory: 128Gi
    

    Set the runtime class to kata-qemu-nvidia-gpu-snp for AMD SEV-SNP hosts or kata-qemu-nvidia-gpu-tdx for Intel TDX hosts, depending on the node type.
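
    To see which Kata runtime classes are installed on your cluster, you can list them; the exact names depend on how the runtime was deployed:

    $ kubectl get runtimeclass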

    Note

    If you configured P_GPU_ALIAS for heterogeneous clusters, replace nvidia.com/pgpu with the model-specific resource type. Refer to Configuring the Sandbox Device Plugin to Use GPU or NVSwitch Specific Resource Types for details.
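
    Before applying the manifest, you can also confirm the resource names and counts that the node advertises. The grep pattern below assumes the default nvidia.com/pgpu and nvidia.com/nvswitch resource names:

    $ kubectl describe node $NODE_NAME | grep -E 'nvidia.com/(pgpu|nvswitch)'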

  2. Create the pod:

    $ kubectl apply -f multi-gpu-kata.yaml
    

    Example Output:

    pod/multi-gpu-kata created
    
  3. Verify the pod is running:

    $ kubectl get pod multi-gpu-kata
    

    Example Output:

    NAME             READY   STATUS    RESTARTS   AGE
    multi-gpu-kata   1/1     Running   0          30s
    
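    If the pod remains in Pending or ContainerCreating, inspecting its events usually shows whether the requested GPU and NVSwitch resources could be scheduled:

    $ kubectl describe pod multi-gpu-kata
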
  4. Verify that all GPUs are visible inside the container:

    $ kubectl exec multi-gpu-kata -- nvidia-smi -L
    

    Example Output:

    GPU 0: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 1: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 2: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 3: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 4: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 5: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 6: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    GPU 7: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    
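    Optionally, you can also confirm the Confidential Computing state of the GPUs from inside the container; this assumes the guest driver is recent enough to support the conf-compute subcommand:

    $ kubectl exec multi-gpu-kata -- nvidia-smi conf-compute -f
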
  5. Delete the pod:

    $ kubectl delete -f multi-gpu-kata.yaml
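
    Example Output:

    pod "multi-gpu-kata" deleted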