On-Premise#

Create a Kubernetes cluster on your own infrastructure to serve as a backend for job execution.

Prerequisites#

Before setting up your on-premises Kubernetes cluster, ensure you have the necessary hardware and software infrastructure available.

Important

  • Kubernetes Version: v1.30.0 or later

  • Architecture: x86_64 or arm64

  • Container Runtime: containerd 1.7.27+

  • Networking: All nodes must be routable and have stable IP addresses, working DNS resolution, and outbound internet access to container registries and the OSMO service

  • Security: Enable encryption at rest for etcd and persistent volumes, and TLS for communications (see Encrypt Data)
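The version minimums above can be checked mechanically. The following is a minimal sketch, assuming version strings shaped like `v1.30.0` or `1.7.27`; the helper names `parse_version` and `meets_minimum` are illustrative and are not part of OSMO or Kubernetes.

```python
import re

# Minimums taken from the prerequisites list above.
MINIMUMS = {
    "kubernetes": "v1.30.0",
    "containerd": "1.7.27",
}


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no MAJOR.MINOR.PATCH version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(component, installed):
    """Return True if the installed version is at or above the minimum."""
    return parse_version(installed) >= parse_version(MINIMUMS[component])


if __name__ == "__main__":
    # Example values only; in practice these might come from
    # `kubectl version` and `containerd --version`.
    print(meets_minimum("kubernetes", "v1.31.2"))
    print(meets_minimum("containerd", "1.7.20"))
```

Comparing integer tuples rather than raw strings matters here: as strings, `1.7.9` would sort after `1.7.27`.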

Node Requirements#

Control Plane

Specification:

  • 12 cores minimum

  • 24 GB RAM minimum

  • 200 GB disk minimum

Configuration:

  • High availability recommended for production (3+ control plane nodes)

  • Ubuntu 22.04+ or equivalent enterprise Linux

Backend-Operator

Specification:

  • 4 cores minimum

  • 8 GB RAM minimum

  • 50 GB disk minimum

Configuration:

  • Dedicated nodes for osmo-backend-operator

  • Auto-scaling: 1-3 nodes recommended

  • Label: node-type=operator

  • Optional taints to isolate operator workloads

Compute (CPU)

Specification:

  • Size per workload needs

Configuration:

  • Configure auto-scaling

  • Label: node-type=compute

  • Optional taints to isolate workflow pods

Compute (GPU x86_64)

Specification:

  • Size per workload needs

Configuration:

  • NVIDIA Driver 535.216.03+

  • CUDA 12.6+

Compute (GPU Jetson)

Configuration:

  • JetPack 6.2+

  • Includes CUDA 12.6
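The fixed hardware minimums in the Node Requirements section can be captured as data and compared against a candidate machine. This is a minimal sketch; the `NodeSpec` class, the `shortfalls` helper, and the role keys `control-plane` and `backend-operator` are hypothetical names chosen for illustration. Compute nodes are omitted because they are sized per workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    cores: int
    ram_gb: int
    disk_gb: int


# Minimums from the Node Requirements section; the role keys are
# hypothetical identifiers, not names used by OSMO.
MINIMUM_SPECS = {
    "control-plane": NodeSpec(cores=12, ram_gb=24, disk_gb=200),
    "backend-operator": NodeSpec(cores=4, ram_gb=8, disk_gb=50),
}


def shortfalls(role, actual):
    """List each resource on which `actual` is below the minimum for `role`."""
    required = MINIMUM_SPECS[role]
    problems = []
    for field in ("cores", "ram_gb", "disk_gb"):
        have, need = getattr(actual, field), getattr(required, field)
        if have < need:
            problems.append(f"{field}: have {have}, need at least {need}")
    return problems


if __name__ == "__main__":
    # Example machine that is short on CPU cores for the operator role.
    for problem in shortfalls("backend-operator", NodeSpec(2, 8, 50)):
        print(problem)
```

An empty list means the machine meets every listed minimum for that role.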

Setup Guide#

Kubeadm (Upstream Kubernetes)

Kubeadm is the official tool for bootstrapping Kubernetes clusters and provides full control over cluster configuration.

Documentation: refer to the official kubeadm documentation on kubernetes.io.

Key Steps:

  1. Install containerd 1.7.27+ as the container runtime

  2. Configure control plane and worker nodes per specifications above

  3. Apply appropriate node labels (node-type=operator, node-type=compute, node-type=gpu, node-type=jetson)

  4. For GPU nodes (x86_64): Install NVIDIA drivers (535.216.03+), CUDA (12.6+), and the NVIDIA Container Toolkit

  5. For GPU nodes (Jetson): Install JetPack 6.2+ (includes CUDA and the container runtime)

  6. Configure a CNI plugin and verify that CoreDNS is operational

  7. Configure firewall rules according to the Kubernetes Ports and Protocols reference
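The labeling in step 3 can be scripted once the nodes have joined the cluster. The sketch below only builds `kubectl label` commands and prints them, so it can be reviewed before anything is run; the node names in the example inventory and the `label_command` helper are hypothetical.

```python
import shlex

# Label values listed in step 3.
NODE_TYPES = ("operator", "compute", "gpu", "jetson")


def label_command(node_name, node_type):
    """Build the `kubectl label` argument list for one node."""
    if node_type not in NODE_TYPES:
        raise ValueError(f"unknown node type: {node_type!r}")
    return [
        "kubectl", "label", "nodes", node_name,
        f"node-type={node_type}", "--overwrite",
    ]


if __name__ == "__main__":
    # Hypothetical node names; replace with names from `kubectl get nodes`.
    inventory = {"op-01": "operator", "cpu-01": "compute", "gpu-01": "gpu"}
    for name, kind in inventory.items():
        print(shlex.join(label_command(name, kind)))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues, should you choose to execute it from Python.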

Note

For production deployments, configure high availability with 3+ control plane nodes and enable encryption at rest for etcd.
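For etcd, encryption at rest is typically enabled by passing an `EncryptionConfiguration` file to kube-apiserver through its `--encryption-provider-config` flag. The following is a minimal sketch that generates such a file, assuming the `secretbox` provider with a locally generated key and encryption of Secrets only; the `build_encryption_config` helper and the key name `key1` are illustrative. Production deployments may prefer a KMS provider so that the key is not stored on the control plane disk.

```python
import base64
import json
import secrets


def build_encryption_config(key_name="key1"):
    """Return an EncryptionConfiguration dict that encrypts Secrets."""
    # The secretbox provider expects a 32-byte key, base64-encoded.
    key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [
            {
                "resources": ["secrets"],
                "providers": [
                    # The first provider is used for writes.
                    {"secretbox": {"keys": [{"name": key_name, "secret": key}]}},
                    # identity comes last so data written before encryption
                    # was enabled can still be read.
                    {"identity": {}},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # JSON is valid YAML, so this output can be saved as the config file.
    print(json.dumps(build_encryption_config(), indent=2))
```

The generated file contains key material, so it should be readable only by the user that runs kube-apiserver.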