Kubernetes with GPUs
In this example, we'll adapt elements of the HA Kubernetes example and the GPU Ollama setup to work together, so you can launch Pods with GPU acceleration.
Set up a config
Create the k3s-gpu.yaml file as below:
config:
  host_groups:
  - name: gpu
    storage: image
    storage_size: 30G
    count: 1
    vcpu: 4
    ram_gb: 16
    gpu_count: 1
  network:
    bridge: brgpu0
    tap_prefix: gputap
    gateway: 192.168.139.1/24
  github_user: alexellis
  kernel_image: "ghcr.io/openfaasltd/actuated-kernel-ch:5.10.240-x86_64-latest"
  image: "ghcr.io/openfaasltd/slicer-systemd-ch:5.10.240-x86_64-latest"
  hypervisor: cloud-hypervisor
Feel free to customise the vCPU, RAM, and disk sizes.
Boot up the VM:
sudo -E slicer up ./k3s-gpu.yaml
Now, run the route commands so that you can SSH into the VM from your workstation.
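If you followed the HA Kubernetes example these will be familiar. As an illustration only, on a Linux workstation the route looks something like the below, where 192.168.1.100 is a placeholder for the LAN IP of the machine running slicer:
# Route traffic for the VM network via the machine running slicer (placeholder IP)
sudo ip route add 192.168.139.0/24 via 192.168.1.100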
Next, log into the VM via SSH:
ssh ubuntu@192.168.139.2
Next, install K3s using K3sup Pro or K3sup CE.
For CE:
k3sup install --host 192.168.139.2 --user ubuntu
You'll get a kubeconfig back; run the commands printed by k3sup to export it so that kubectl uses it.
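The exact commands are printed by k3sup and typically look something like the below, with the kubeconfig written to the current working directory:
# Point kubectl at the kubeconfig written by k3sup
export KUBECONFIG=$(pwd)/kubeconfig
kubectl config use-context default

# Confirm the node has joined and is Ready
kubectl get node -o wide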
Install the Nvidia driver:
curl -SLsO https://raw.githubusercontent.com/self-actuated/nvidia-run/refs/heads/master/setup-nvidia-run.sh
chmod +x ./setup-nvidia-run.sh
sudo bash ./setup-nvidia-run.sh
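Once the script has finished, you can confirm that the driver is loaded by running nvidia-smi directly on the VM:
# Should print the GPU model and the installed driver version
nvidia-smi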
Install the Nvidia Container Toolkit using the official instructions. Use the instructions for "apt: Ubuntu, Debian".
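At the time of writing, the apt-based setup looks roughly like the below, but follow the official guide in case the repository layout has changed:
# Add NVIDIA's signing key and apt repository for the Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit itself
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit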
Confirm that the nvidia container runtime has been found by K3s:
sudo grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml
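K3s scans for the NVIDIA runtime when it starts up, so if the grep comes back empty, restart K3s so that it regenerates its containerd configuration, then check again:
sudo systemctl restart k3s
sudo grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml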
Apply a new runtime class for Nvidia:
cat > nvidia-runtime.yaml <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
EOF
kubectl create -f nvidia-runtime.yaml
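You can verify that the RuntimeClass was registered before moving on:
kubectl get runtimeclass nvidia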
Run a test Pod to show the output from nvidia-smi:
cat > nvidia-smi-pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi
spec:
  runtimeClassName: nvidia
  restartPolicy: OnFailure
  containers:
  - name: nvidia-smi
    image: nvidia/cuda:12.1.0-base-ubuntu22.04
    command: ['sh', '-c', "nvidia-smi"]
EOF
kubectl create -f nvidia-smi-pod.yaml
For the time being, this Pod uses only the runtimeClassName to request a GPU. Adding the usual limits section, as shown below, does not work at present and may require additional configuration in K3s or containerd:
+    resources:
+      limits:
+        nvidia.com/gpu: "1"
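The Pod runs nvidia-smi once and then exits, so give it a moment to pull the image and complete. You can watch its progress with:
# Press Ctrl+C once the Pod shows Completed
kubectl get pod nvidia-smi --watch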
Fetch the logs:
kubectl logs nvidia-smi
$ kubectl logs pod/nvidia-smi
Mon Sep 1 15:04:11 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.76.05 Driver Version: 580.76.05 CUDA Version: 13.0 |
+-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3090 Off | 00000000:00:07.0 Off | N/A |
| 30% 46C P0 110W / 350W | 0MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Enable Device Plugin
Install Nvidia's Device Plugin for Kubernetes:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.1/deployments/static/nvidia-device-plugin.yml
Patch it so it works with K3s:
# add runtimeClassName: nvidia to the DS pod spec
kubectl -n kube-system patch ds nvidia-device-plugin-daemonset \
  --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/runtimeClassName","value":"nvidia"}]'
kubectl -n kube-system rollout status ds/nvidia-device-plugin-daemonset
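Once the device plugin Pod is running, the node should advertise nvidia.com/gpu as an allocatable resource. Check for it, then delete the earlier test Pod so that its name can be reused:
# Capacity and Allocatable should now include nvidia.com/gpu: 1
kubectl describe node | grep -i nvidia.com/gpu

# Free up the name for the next test Pod
kubectl delete pod nvidia-smi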
Then run the Pod from earlier, but with the limits in place:
cat > nvidia-smi-pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi
spec:
  runtimeClassName: nvidia
  restartPolicy: OnFailure
  containers:
  - name: nvidia-smi
    image: nvidia/cuda:12.1.0-base-ubuntu22.04
    command: ['sh', '-c', "nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: "1"
EOF
kubectl create -f nvidia-smi-pod.yaml
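As before, fetch the logs to confirm that nvidia-smi ran successfully, this time with the GPU requested through the device plugin:
kubectl logs nvidia-smi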