homelab/k3s/README.md
Samantha Atkins 759ef949bc K3s cluster on Proxmox with WireGuard mesh networking
Replaced Headscale (too buggy in 0.28.x — random node drops) with direct
WireGuard hub-and-spoke + full mesh. 7 Proxmox VMs across 3 hosts form a
K3s v1.34.6 cluster: 3 control-plane/etcd nodes, 4 workers.

Running services: postgres, mariadb, ghost (x3), forgejo, authentik.
All unpinned services use local-path StorageClass. Databases pinned to
pve-worker and adder-worker with local PVs.

Includes VM provisioning scripts (create-debian-template.sh, clone-vm.sh),
K3s manifests for all services, and full deployment docs in k3s/README.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 01:23:13 -04:00


K3s Cluster — Setup & Deployment Notes

This is the production cluster running on Proxmox VMs. Nodes talk to each other over a direct WireGuard full mesh on vmbr1, with the DO droplet acting as the hub for external traffic. The VirtualBox learning cluster it replaced is retired.


WireGuard Mesh — Node Assignments

Hub: DO droplet at 138.197.87.251:51820, WG IP 10.0.0.1/24

| Node | vmbr1 IP | WG IP | Proxmox Host |
|------|----------|-------|--------------|
| pve-control | 10.10.10.151 | 10.0.0.6 | pve |
| pve-worker | 10.10.10.126 | 10.0.0.7 | pve |
| adder-control | 10.10.10.185 | 10.0.0.8 | adder |
| adder-worker | 10.10.10.83 | 10.0.0.9 | adder |
| game-control | 10.10.10.158 | 10.0.0.10 | game |
| game-worker-hdd | 10.10.10.186 | 10.0.0.11 | game |
| game-worker-ssd | 10.10.10.153 | 10.0.0.12 | game |

IPs 10.0.0.2 through 10.0.0.5 are reserved (old VirtualBox K3s nodes, leave alone).

All VMs are Debian Trixie on vmbr1 (10.10.10.0/24). Inter-node traffic runs over WireGuard (10.0.0.0/24).


K3s Install

Prerequisites — each VM must be on the WireGuard mesh first

WireGuard is configured via wg0.conf on each node (hub-and-spoke through DO droplet). Verify connectivity: ping 10.0.0.1 from the node.
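As a rough illustration of the spoke side, the sketch below prints a wg0.conf containing only the hub peer. The hub endpoint and the 10.0.0.0/24 range come from the table above. Both keys are placeholders, `PersistentKeepalive = 25` is an assumed value, and the function name `render_wg0_conf` is made up for this example. The direct node-to-node peers of the full mesh are not shown.

```shell
#!/bin/sh
# Sketch: print a spoke-side wg0.conf for one node (hub peer only).
# The endpoint is the DO droplet listed in the node assignments above.
HUB_ENDPOINT="138.197.87.251:51820"

render_wg0_conf() {
    node_wg_ip="$1"      # this node's WG IP, e.g. 10.0.0.6
    private_key="$2"     # placeholder: output of `wg genkey` on this node
    hub_public_key="$3"  # placeholder: public key of the hub

    cat <<EOF
[Interface]
Address = ${node_wg_ip}/24
PrivateKey = ${private_key}

[Peer]
# Hub (DO droplet)
PublicKey = ${hub_public_key}
Endpoint = ${HUB_ENDPOINT}
AllowedIPs = 10.0.0.0/24
# Assumed value; keeps NAT mappings alive between nodes and the hub
PersistentKeepalive = 25
EOF
}
```

One way to use it: `render_wg0_conf 10.0.0.6 "<private-key>" "<hub-public-key>" | sudo tee /etc/wireguard/wg0.conf`, then `sudo wg-quick up wg0` and `ping 10.0.0.1`.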

First control plane node (cluster init)

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--cluster-init --disable traefik \
  --node-ip=<10.0.0.x> --flannel-iface=wg0" sh -

# Get token for other nodes to join
sudo cat /var/lib/rancher/k3s/server/node-token

Second and third control plane nodes

curl -sfL https://get.k3s.io | K3S_TOKEN=<token> \
  INSTALL_K3S_EXEC="server --server https://<control-1-mesh-ip>:6443 --disable traefik \
  --node-ip=<this-node-mesh-ip> --flannel-iface=wg0" sh -

Note: start INSTALL_K3S_EXEC with server and pass --server; do not set K3S_URL here. The install script defaults to agent mode whenever K3S_URL is set and no command is given, which would join the node as a worker instead of a control plane peer. etcd needs a majority of its members to stay up, so run an odd number of control nodes: 3 tolerate 1 failure, while 2 tolerate none. Never stop at 2.
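The failure tolerance in the note above follows from etcd's majority rule. A minimal sketch of the arithmetic, with function names invented for this example:

```shell
#!/bin/sh
# Quorum is a strict majority of members.
etcd_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Tolerated failures are the members left over once quorum is met.
etcd_fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

Two members tolerate no failures, the same as one, and four tolerate only one, the same as three. That is why an even member count adds risk without adding resilience.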

Workers

curl -sfL https://get.k3s.io | K3S_URL=https://<any-control-mesh-ip>:6443 K3S_TOKEN=<token> \
  INSTALL_K3S_EXEC="--node-ip=<this-node-mesh-ip> --flannel-iface=wg0" sh -

kubeconfig for normal user (on any control node)

mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown samantha:samantha ~/.kube/config
export KUBECONFIG=~/.kube/config   # also add to ~/.bashrc
# Update server IP in config if needed:
sed -i 's/127.0.0.1/<control-1-mesh-ip>/' ~/.kube/config

Label workers

kubectl label node <name> node-role.kubernetes.io/worker=worker

GPU Worker Nodes — adder and game

Both Proxmox hosts adder and game have RTX 2070 GPUs available for PCIe passthrough.

Proxmox PCIe passthrough setup (on each Proxmox host)

# Enable IOMMU in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# (use amd_iommu=on for AMD hosts)
update-grub
reboot

# Blacklist nvidia drivers on host so GPU is free for passthrough:
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
update-initramfs -u
reboot

In Proxmox UI: VM Hardware → Add → PCI Device → select the RTX 2070 → check "All Functions" and "Primary GPU" if it is the only GPU.
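After the reboot it can be worth confirming that the IOMMU flag actually reached the running kernel before debugging anything inside the VM. A small sketch, assuming a GRUB-booted host as in the steps above; the function name `iommu_flag_present` is made up for this example:

```shell
#!/bin/sh
# Returns 0 if the given kernel command line contains an IOMMU enable flag.
iommu_flag_present() {
    case " $1 " in
        *" intel_iommu=on "*|*" amd_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the Proxmox host, inspect the command line the kernel booted with.
if [ -r /proc/cmdline ]; then
    if iommu_flag_present "$(cat /proc/cmdline)"; then
        echo "IOMMU flag present on kernel command line"
    else
        echo "IOMMU flag missing: check /etc/default/grub and rerun update-grub"
    fi
fi
```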

Inside the GPU worker VM — install NVIDIA drivers

apt-get install -y linux-headers-$(uname -r)
# Add non-free repo if needed:
apt-get install -y nvidia-driver firmware-misc-nonfree
reboot
# Verify:
nvidia-smi

Install NVIDIA device plugin in K3s

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml

Label GPU nodes

kubectl label node k3s-adder nvidia.com/gpu=true
kubectl label node k3s-game nvidia.com/gpu=true

Verify GPU is schedulable

kubectl get nodes -o json | jq '.items[].status.capacity'
# Should show nvidia.com/gpu: "1" on adder and game

Scheduling a workload to a GPU node

resources:
  limits:
    nvidia.com/gpu: 1
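For context, the fragment above sits inside a container spec. The sketch below prints a complete one-shot pod that requests a GPU and runs nvidia-smi. The pod name, the image tag, and the function name `render_gpu_test_pod` are assumptions for illustration. The nodeSelector relies on the `nvidia.com/gpu=true` label applied earlier. Depending on how the container runtime is configured, the pod may also need a `runtimeClassName`, which is not shown.

```shell
#!/bin/sh
# Sketch: print a smoke-test pod manifest that requests one GPU.
render_gpu_test_pod() {
    pod_name="${1:-gpu-smoke-test}"                    # assumed name
    image="${2:-nvidia/cuda:12.4.1-base-ubuntu22.04}"  # assumed image tag

    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
spec:
  restartPolicy: Never
  nodeSelector:
    nvidia.com/gpu: "true"
  containers:
    - name: cuda
      image: ${image}
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}
```

Pipe it into the cluster with `render_gpu_test_pod | kubectl apply -f -` and read the result with `kubectl logs gpu-smoke-test`.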

Namespaces — one per venture

kubectl create namespace sjasoft
kubectl create namespace fulfillment
kubectl create namespace privacy-practice

Secrets are always created per namespace — never share secrets across namespaces.


Secrets

Never stored in files with real values. Always create directly on a control node.

# Pattern — adapt per service and namespace
kubectl create secret generic <name> \
  --namespace <namespace> \
  --from-literal=<key>='<value>'

# Generate passwords with:
openssl rand -base64 24
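The two steps above can be combined so the generated value goes straight into the secret and is never written to a file. This is a sketch only: the function name `create_password_secret` and the overridable `KUBECTL` variable are conveniences invented here, not part of kubectl.

```shell
#!/bin/sh
# Sketch: generate a password and create the secret in one step.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

create_password_secret() {
    name="$1"        # secret name
    namespace="$2"   # target namespace; one secret per namespace
    key="$3"         # key inside the secret, e.g. password

    # 24 random bytes encode to 32 base64 characters
    password="$(openssl rand -base64 24)"

    "$KUBECTL" create secret generic "$name" \
        --namespace "$namespace" \
        --from-literal="${key}=${password}"
}
```

Example call on a control node, with a made-up secret name: `create_password_secret ghost-db sjasoft password`. Note that previewing with `KUBECTL=echo` prints the generated value to the terminal.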

NodePort Registry

NodePorts must be unique across the entire cluster (range 30000-32767). Any NodePort is reachable on any node's WireGuard IP — K3s routes internally. Caddy on each venture ingress VPS proxies to any node's WG IP + NodePort.

| Port | Service | Notes |
|------|---------|-------|
| 32368 | ghost1 | blog.the-fulfillment.org |
| 32369 | ghost2 | blog.privacy-practice.com |
| 32370 | ghost3 | blog.sjasoft.com |
| 32371 | forgejo | git.sjasoft.com |
| 32372 | authentik (HTTP) | auth.sjasoft.com (use this behind Caddy) |
| 32373 | authentik (HTTPS) | skip; Caddy handles TLS |
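Since the registry is maintained by hand, a quick check before adding a port can catch collisions. The sketch below reads `port service` pairs from stdin and flags duplicates and values outside 30000-32767. The function name `check_nodeports` and the hyphenated service names are made up for this example.

```shell
#!/bin/sh
# Sketch: validate a "port service" registry read from stdin.
# Prints one line per problem and returns non-zero if any were found.
check_nodeports() {
    awk '
        $1 < 30000 || $1 > 32767 { printf "out of range: %s (%s)\n", $1, $2; bad = 1 }
        seen[$1]                 { printf "duplicate: %s (%s and %s)\n", $1, seen[$1], $2; bad = 1 }
        { seen[$1] = $2 }
        END { exit bad + 0 }
    '
}

# Current registry; prints nothing when every port is valid and unique.
check_nodeports <<'EOF'
32368 ghost1
32369 ghost2
32370 ghost3
32371 forgejo
32372 authentik-http
32373 authentik-https
EOF
```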

Caddy Pattern — venture ingress VPS

Each venture has its own ingress VPS with its own public IP. Caddy on each proxies to a different node's mesh IP for the same cluster — ventures look unrelated from outside.

# Example — any node's WG IP works for any NodePort
blog.the-fulfillment.org {
    reverse_proxy 10.0.0.6:32368
}

git.sjasoft.com {
    reverse_proxy 10.0.0.8:32371
}

auth.sjasoft.com {
    reverse_proxy 10.0.0.10:32372
}

Pick any node's WG IP per service — they all work. Use different nodes per venture so ventures look unrelated from outside. See the WireGuard mesh table above for IPs.
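For the Ghost blogs that are not wired up yet, the same pattern can be stamped out from the NodePort registry. A minimal sketch: the function name `render_caddy_site` is invented, and the node IPs chosen in the two calls are arbitrary picks from the mesh table, not a record of what is deployed.

```shell
#!/bin/sh
# Sketch: print a Caddyfile site block that proxies a domain to a NodePort
# on one node's WireGuard IP.
render_caddy_site() {
    domain="$1"      # public hostname served by this VPS
    wg_ip="$2"       # any cluster node's WG IP
    node_port="$3"   # NodePort from the registry
    printf '%s {\n    reverse_proxy %s:%s\n}\n' "$domain" "$wg_ip" "$node_port"
}

render_caddy_site blog.privacy-practice.com 10.0.0.8 32369
render_caddy_site blog.sjasoft.com 10.0.0.10 32370
```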


Current Deployment Status (2026-04-07)

K3s v1.34.6 cluster fully operational. WireGuard full mesh (direct peer-to-peer over vmbr1, hub for external traffic). Headscale removed — too buggy (0.28.x dropped nodes randomly).

Cluster Nodes

| Node | Role | WG IP | Proxmox Host | Resources |
|------|------|-------|--------------|-----------|
| pve-control | control-plane, etcd | 10.0.0.6 | pve | 2 CPU, 2GB RAM, 20GB |
| pve-worker | worker | 10.0.0.7 | pve | 8 CPU, 58GB RAM, 3.3TB |
| adder-control | control-plane, etcd | 10.0.0.8 | adder | 2 CPU, 2GB RAM, 20GB |
| adder-worker | worker | 10.0.0.9 | adder | 10 CPU, 58GB RAM, 1.7TB |
| game-control | control-plane, etcd | 10.0.0.10 | game | 2 CPU, 2GB RAM, 20GB |
| game-worker-hdd | worker | 10.0.0.11 | game | 4 CPU, 6GB RAM, 1.4TB HDD |
| game-worker-ssd | worker | 10.0.0.12 | game | 10 CPU, 8GB RAM, 200GB SSD |

Running Services

| Service | Node | NodePort | Domain | Status |
|---------|------|----------|--------|--------|
| postgres:16 | pve-worker (pinned) | ClusterIP | | running |
| mariadb:11 | adder-worker (pinned) | ClusterIP | | running |
| ghost1 | unpinned | 32368 | blog.the-fulfillment.org | running |
| ghost2 | unpinned | 32369 | blog.privacy-practice.com | running |
| ghost3 | unpinned | 32370 | blog.sjasoft.com | running |
| forgejo:9 | unpinned | 32371 | git.sjasoft.com | running |
| authentik server | unpinned | 32372 | auth.sjasoft.com | running |
| authentik worker | unpinned | | | running |

Remaining Services to Deploy

n8n, nats, vaultwarden, synapse, snikket, monerod

Next Steps

  • Add VirtualBox workstation VMs as workers to this cluster
  • Wire up remaining Ghost blogs in Caddy
  • Deploy remaining services from k3s/ manifests

Install Method

K3s was installed using /etc/rancher/k3s/config.yaml on each node (not INSTALL_K3S_EXEC env vars, which get lost in nested SSH). Binary was downloaded once to pve and distributed via scp. Use INSTALL_K3S_SKIP_DOWNLOAD=true when binary is pre-staged.
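The actual config files are not reproduced here. As a sketch of what a server node's config.yaml could look like, the function below prints one whose keys mirror the CLI flags used in the install section. The function name `render_k3s_server_config` is made up for this example.

```shell
#!/bin/sh
# Sketch: print /etc/rancher/k3s/config.yaml content for a control node.
render_k3s_server_config() {
    node_ip="$1"        # this node's WG IP
    join_url="${2:-}"   # empty for the first control node
    token="${3:-}"      # needed when join_url is set

    if [ -z "$join_url" ]; then
        # First control node initialises the embedded etcd cluster
        echo "cluster-init: true"
    else
        # Later control nodes join the existing cluster
        echo "server: ${join_url}"
        echo "token: ${token}"
    fi
    cat <<EOF
disable:
  - traefik
node-ip: ${node_ip}
flannel-iface: wg0
EOF
}
```

A possible flow, assuming the install script was saved locally as `install.sh` and the pre-staged binary sits at `/usr/local/bin/k3s`: `render_k3s_server_config 10.0.0.6 | sudo tee /etc/rancher/k3s/config.yaml`, then `INSTALL_K3S_SKIP_DOWNLOAD=true INSTALL_K3S_EXEC=server ./install.sh`.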