Run a Sample Workload
After completing the deployment steps, you can verify your installation by running a sample GPU workload in a confidential container.
A pod manifest for a confidential container GPU workload must specify the `kata-qemu-nvidia-gpu-snp` runtime class for SEV-SNP or `kata-qemu-nvidia-gpu-tdx` for TDX.
Create a file, such as the following `cuda-vectoradd-kata.yaml` sample, specifying the `kata-qemu-nvidia-gpu-snp` runtime class:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd-kata
  namespace: default
spec:
  runtimeClassName: kata-qemu-nvidia-gpu-snp
  restartPolicy: Never
  containers:
  - name: cuda-vectoradd
    image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
    resources:
      limits:
        nvidia.com/pgpu: "1"
        memory: 16Gi
```
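Before applying the manifest, you can confirm that the expected runtime class is registered in the cluster. This is a quick sanity check, not part of the official procedure; it assumes the deployment created runtime classes with the `kata-qemu-nvidia-gpu` prefix:

```shell
# List the Kata runtime classes registered by the deployment.
# The SNP or TDX class must exist before a pod can reference it.
$ kubectl get runtimeclass | grep kata-qemu-nvidia-gpu
```

If the class named in your manifest is missing, revisit the deployment steps before creating the pod.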
The sample manifest includes the following Confidential Containers configurations:

- The runtime class is set to `kata-qemu-nvidia-gpu-snp` for SEV-SNP or `kata-qemu-nvidia-gpu-tdx` for TDX, depending on the node type where the workloads should run.
- In the sample above, `nvidia.com/pgpu` is the default resource type for GPUs. If you are deploying on a heterogeneous cluster, you might want to change the default behavior by specifying the `P_GPU_ALIAS` environment variable for the sandbox device plugin. Refer to Configuring the Sandbox Device Plugin to Use GPU or NVSwitch Specific Resource Types for more details.
- If you have machines that support multi-GPU passthrough, refer to the Configuring Multi-GPU Passthrough page for a complete workload example and architecture-specific CC mode requirements.
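To see how many passthrough GPUs each node advertises, you can query the allocatable resources directly. This sketch assumes the default `nvidia.com/pgpu` resource name; adjust the path if you have set `P_GPU_ALIAS`:

```shell
# Show each node's allocatable passthrough-GPU count.
# The dots in "nvidia.com" must be escaped inside the custom-columns path.
$ kubectl get nodes -o custom-columns='NODE:.metadata.name,PGPU:.status.allocatable.nvidia\.com/pgpu'
```

Nodes that report `<none>` either lack a passthrough GPU or have not been set up by the sandbox device plugin.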
Create the pod:

```shell
$ kubectl apply -f cuda-vectoradd-kata.yaml
```

Example Output:

```
pod/cuda-vectoradd-kata created
```

Optional: Verify the pod is running.

```shell
$ kubectl get pod cuda-vectoradd-kata
```

Example Output:

```
NAME                  READY   STATUS    RESTARTS   AGE
cuda-vectoradd-kata   1/1     Running   0          10s
```
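In scripts, instead of polling `kubectl get pod`, you can block until the pod is ready. The 180-second timeout below is an arbitrary example value; confidential VMs that pull large images can take longer to start:

```shell
# Wait until the pod reports Ready, or fail after the timeout.
$ kubectl wait --for=condition=Ready pod/cuda-vectoradd-kata --timeout=180s
```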
View the logs from the pod after the container starts:

```shell
$ kubectl logs -n default cuda-vectoradd-kata
```

Example Output:

```
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
```
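Because the manifest sets `restartPolicy: Never`, a successful run leaves the pod in the `Succeeded` phase once the container exits. If you are scripting the verification, a sketch of checking that phase directly:

```shell
# Print the pod phase; "Succeeded" indicates the vectorAdd sample ran to completion.
$ kubectl get pod cuda-vectoradd-kata -o jsonpath='{.status.phase}'
```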
Delete the pod:

```shell
$ kubectl delete -f cuda-vectoradd-kata.yaml
```
Next Steps
- Configure Attestation with the Trustee framework to enable remote verification of your confidential environment.
- Set up multi-GPU passthrough for NVSwitch-based HGX systems.
- Tune image pull timeouts if you are pulling large container images.
- Manage the confidential computing mode on your GPUs.