Samantha Atkins 759ef949bc K3s cluster on Proxmox with WireGuard mesh networking

Replaced Headscale (too buggy in 0.28.x — random node drops) with direct
WireGuard hub-and-spoke + full mesh. 7 Proxmox VMs across 3 hosts form a
K3s v1.34.6 cluster: 3 control-plane/etcd nodes, 4 workers.

Running services: postgres, mariadb, ghost (x3), forgejo, authentik.
All unpinned services use local-path StorageClass. Databases pinned to
pve-worker and adder-worker with local PVs.

Includes VM provisioning scripts (create-debian-template.sh, clone-vm.sh),
K3s manifests for all services, and full deployment docs in k3s/README.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-07 01:23:13 -04:00


K3s Cluster — Setup & Deployment Notes

This is the production cluster running on Proxmox VMs, connected via WireGuard: nodes peer directly with each other over vmbr1, and the DO droplet hub carries external traffic. The VirtualBox learning cluster this replaced is retired.

WireGuard Mesh — Node Assignments

Hub: DO droplet at 138.197.87.251:51820, WG IP 10.0.0.1/24

Node	vmbr1 IP	WG IP	Proxmox Host
pve-control	10.10.10.151	10.0.0.6	pve
pve-worker	10.10.10.126	10.0.0.7	pve
adder-control	10.10.10.185	10.0.0.8	adder
adder-worker	10.10.10.83	10.0.0.9	adder
game-control	10.10.10.158	10.0.0.10	game
game-worker-hdd	10.10.10.186	10.0.0.11	game
game-worker-ssd	10.10.10.153	10.0.0.12	game

IPs 10.0.0.2–10.0.0.5 are reserved (old VirtualBox K3s nodes, leave alone).

All VMs are Debian Trixie on vmbr1 (10.10.10.0/24). Inter-node traffic runs over WireGuard (10.0.0.0/24).

K3s Install

Prerequisites — each VM must be on the WireGuard mesh first

WireGuard is configured via wg0.conf on each node (hub-and-spoke through DO droplet). Verify connectivity: ping 10.0.0.1 from the node.
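The single ping above can be extended to every peer. A minimal sketch, assuming the WG IPs from the node table; check_mesh and the PING_CMD override are hypothetical names introduced here, not part of WireGuard or K3s:

```shell
# WG IPs copied from the node table above; 10.0.0.1 is the hub.
MESH_IPS="10.0.0.1 10.0.0.6 10.0.0.7 10.0.0.8 10.0.0.9 10.0.0.10 10.0.0.11 10.0.0.12"

# PING_CMD is a hypothetical override so the probe can be swapped out.
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"

# check_mesh probes each IP once and prints "ok <ip>" or "FAIL <ip>".
# The return status is the number of unreachable peers (0 = all reachable).
check_mesh() {
    failed=0
    for ip in $MESH_IPS; do
        if $PING_CMD "$ip" >/dev/null 2>&1; then
            echo "ok   $ip"
        else
            echo "FAIL $ip"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}
```

Run check_mesh on a node before installing K3s; any FAIL line points at a peer whose wg0.conf should be checked first.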

First control plane node (cluster init)

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--cluster-init --disable traefik \
  --node-ip=<10.0.0.x> --flannel-iface=wg0" sh -

# Get token for other nodes to join
sudo cat /var/lib/rancher/k3s/server/node-token

Second and third control plane nodes

curl -sfL https://get.k3s.io | K3S_TOKEN=<token> \
  INSTALL_K3S_EXEC="server --server https://<control-1-mesh-ip>:6443 --disable traefik \
  --node-ip=<this-node-mesh-ip> --flannel-iface=wg0" sh -

Note: pass the server subcommand with --server and do not set K3S_URL here. When K3S_URL is set and no subcommand is given, the install script installs an agent (worker), not a control plane peer. etcd needs a majority of members to stay up: 3 control nodes tolerate 1 failure, while 2 tolerate none. Never stop at 2.
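The etcd sizing rule comes down to majority arithmetic. A small sketch (quorum and tolerance are hypothetical helper names) that shows why 2 members are no safer than 1:

```shell
# quorum N    -> members needed for a majority: floor(N/2) + 1
# tolerance N -> members that can fail while a majority still survives
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
```

An even member count adds a node that can fail without raising the number of failures the cluster survives, which is why the control plane here is 3 nodes and the next useful size would be 5.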

Workers

curl -sfL https://get.k3s.io | K3S_URL=https://<any-control-mesh-ip>:6443 K3S_TOKEN=<token> \
  INSTALL_K3S_EXEC="--node-ip=<this-node-mesh-ip> --flannel-iface=wg0" sh -

kubeconfig for normal user (on any control node)

mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown samantha:samantha ~/.kube/config
export KUBECONFIG=~/.kube/config   # also add to ~/.bashrc
# Update server IP in config if needed:
sed -i 's/127\.0\.0\.1/<control-1-mesh-ip>/' ~/.kube/config

Label workers

kubectl label node <name> node-role.kubernetes.io/worker=worker

GPU Worker Nodes — adder and game

Both Proxmox hosts adder and game have RTX 2070 GPUs available for PCIe passthrough.

Proxmox PCIe passthrough setup (on each Proxmox host)

# Enable IOMMU in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# (use amd_iommu=on for AMD hosts)
update-grub
reboot

# Blacklist nvidia drivers on host so GPU is free for passthrough:
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
update-initramfs -u
reboot

In Proxmox UI: VM Hardware → Add → PCI Device → select the RTX 2070 → check "All Functions" and "Primary GPU" if it is the only GPU.
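Before adding the PCI device it can help to confirm on the Proxmox host that IOMMU is active and to see which group the GPU landed in. A sketch, assuming the standard sysfs layout; list_iommu_groups is a hypothetical helper name:

```shell
# list_iommu_groups walks the kernel's IOMMU group tree and prints
# "group <n>: <pci-address>" for every device.
# The optional argument overrides the root directory (default is sysfs).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # nothing matched: IOMMU is probably off
        group="${dev#"$root"/}"        # strip the root prefix
        group="${group%%/*}"           # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}
```

No output after the reboot suggests the GRUB change did not take effect. Match the printed PCI addresses against lspci -nn to find the RTX 2070 and the devices that share its group.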

Inside the GPU worker VM — install NVIDIA drivers

apt-get install -y linux-headers-$(uname -r)
# Add non-free repo if needed:
apt-get install -y nvidia-driver firmware-misc-nonfree
reboot
# Verify:
nvidia-smi

Install NVIDIA device plugin in K3s

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml

Label GPU nodes

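# Node names must match the output of kubectl get nodes (the node tables in this document use names such as adder-worker)
kubectl label node k3s-adder nvidia.com/gpu=true
kubectl label node k3s-game nvidia.com/gpu=true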

Verify GPU is schedulable

kubectl get nodes -o json | jq '.items[] | {name: .metadata.name, capacity: .status.capacity}'
# Should show nvidia.com/gpu: "1" in the capacity of the GPU workers on adder and game

Scheduling a workload to a GPU node

resources:
  limits:
    nvidia.com/gpu: 1
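For context, the resources fragment sits inside a container spec. A sketch of a complete smoke-test pod; the pod name, the image tag, and the gpu_test_pod helper are illustrative assumptions, and depending on how the NVIDIA container runtime is set up in K3s a runtimeClassName may also be needed (omitted here):

```shell
# gpu_test_pod prints a Pod manifest that requests one GPU and runs
# nvidia-smi once. Name and image are placeholders, not from this repo.
gpu_test_pod() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}

# Usage (on a control node):
#   gpu_test_pod | kubectl apply -f -
#   kubectl logs gpu-smoke-test
```

The scheduler only places the pod on a node whose capacity advertises nvidia.com/gpu, so no nodeSelector is needed for the limit itself to take effect.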

Namespaces — one per venture

kubectl create namespace sjasoft
kubectl create namespace fulfillment
kubectl create namespace privacy-practice

Secrets are always created per namespace — never share secrets across namespaces.

Secrets

Real secret values are never stored in files. Always create secrets directly on a control node.

# Pattern — adapt per service and namespace
kubectl create secret generic <name> \
  --namespace <namespace> \
  --from-literal=<key>='<value>'

# Generate passwords with:
openssl rand -base64 24
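The two steps can be combined so the generated value never touches a file. A sketch under stated assumptions: gen_password and create_random_secret are hypothetical helper names, and the KUBECTL variable exists only so the command can be swapped for a dry run:

```shell
# gen_password prints 24 random bytes as base64 (32 characters).
# Falls back to /dev/urandom if openssl is not installed.
gen_password() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -base64 24
    else
        head -c 24 /dev/urandom | base64
    fi
}

# usage: create_random_secret <namespace> <name> <key>
# Generates the value inline and passes it straight to kubectl.
create_random_secret() {
    "${KUBECTL:-kubectl}" create secret generic "$2" \
        --namespace "$1" \
        --from-literal="$3=$(gen_password)"
}
```

Setting KUBECTL=echo prints the command instead of running it, which is a quick way to check the namespace and key before creating anything.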

NodePort Registry

NodePorts must be unique across the entire cluster (range 30000-32767). Any NodePort is reachable on any node's WireGuard IP — K3s routes internally. Caddy on each venture ingress VPS proxies to any node's WG IP + NodePort.

Port	Service	Notes
32368	ghost1	blog.the-fulfillment.org
32369	ghost2	blog.privacy-practice.com
32370	ghost3	blog.sjasoft.com
32371	forgejo	git.sjasoft.com
32372	authentik (HTTP)	auth.sjasoft.com — use this behind Caddy
32373	authentik (HTTPS)	skip — Caddy handles TLS
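Since uniqueness and the port range are both hard requirements, the registry can be checked mechanically before a new service is added. A sketch; check_nodeports is a hypothetical helper that reads one port per line:

```shell
# check_nodeports reads ports on stdin and reports duplicates and values
# outside 30000-32767. Exit status is 0 only when every port passes.
check_nodeports() {
    awk '
        $1 < 30000 || $1 > 32767 { print "out of range: " $1; bad = 1 }
        seen[$1]++               { print "duplicate: " $1;    bad = 1 }
        END                      { exit bad + 0 }
    '
}

# The registry from the table above:
printf '%s\n' 32368 32369 32370 32371 32372 32373 | check_nodeports && echo "registry ok"
```

Append the candidate port to the list before assigning it; a non-zero exit means the port is already taken or outside the NodePort range.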

Caddy Pattern — venture ingress VPS

Each venture has its own ingress VPS with its own public IP, so ventures look unrelated from outside. Caddy on each VPS proxies to a different node's mesh IP on the same cluster.

# Example — any node's WG IP works for any NodePort
blog.the-fulfillment.org {
    reverse_proxy 10.0.0.6:32368
}

git.sjasoft.com {
    reverse_proxy 10.0.0.8:32371
}

auth.sjasoft.com {
    reverse_proxy 10.0.0.10:32372
}

Pick any node's WG IP per service; they all work. Keep using a different node per venture, as in the example. See the WireGuard mesh table above for IPs.
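Because every site block has the same shape, the remaining entries can be stamped out from the NodePort registry. A sketch; caddy_site is a hypothetical helper, and the node IP in the example call is an arbitrary pick, not a recorded assignment:

```shell
# usage: caddy_site <domain> <node-wg-ip> <nodeport>
# Prints one Caddyfile site block in the same layout as the examples above.
caddy_site() {
    printf '%s {\n    reverse_proxy %s:%s\n}\n' "$1" "$2" "$3"
}

# Example: ghost2 from the registry, proxied through pve-worker's WG IP.
caddy_site blog.privacy-practice.com 10.0.0.7 32369
```

Append the output to the Caddyfile on the matching venture's VPS and reload Caddy there.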

Current Deployment Status (2026-04-07)

K3s v1.34.6 cluster fully operational. WireGuard full mesh (direct peer-to-peer over vmbr1, hub for external traffic). Headscale removed — too buggy (0.28.x dropped nodes randomly).

Cluster Nodes

Node	Role	WG IP	Proxmox Host	Resources
pve-control	control-plane, etcd	10.0.0.6	pve	2 CPU, 2GB RAM, 20GB
pve-worker	worker	10.0.0.7	pve	8 CPU, 58GB RAM, 3.3TB
adder-control	control-plane, etcd	10.0.0.8	adder	2 CPU, 2GB RAM, 20GB
adder-worker	worker	10.0.0.9	adder	10 CPU, 58GB RAM, 1.7TB
game-control	control-plane, etcd	10.0.0.10	game	2 CPU, 2GB RAM, 20GB
game-worker-hdd	worker	10.0.0.11	game	4 CPU, 6GB RAM, 1.4TB HDD
game-worker-ssd	worker	10.0.0.12	game	10 CPU, 8GB RAM, 200GB SSD

Running Services

Service	Node	NodePort	Domain	Status
postgres:16	pve-worker (pinned)	ClusterIP	—	running
mariadb:11	adder-worker (pinned)	ClusterIP	—	running
ghost1	unpinned	32368	blog.the-fulfillment.org	running
ghost2	unpinned	32369	blog.privacy-practice.com	running
ghost3	unpinned	32370	blog.sjasoft.com	running
forgejo:9	unpinned	32371	git.sjasoft.com	running
authentik server	unpinned	32372	auth.sjasoft.com	running
authentik worker	unpinned	—	—	running

Remaining Services to Deploy

n8n, nats, vaultwarden, synapse, snikket, monerod

Next Steps

Add VirtualBox workstation VMs as workers to this cluster
Wire up remaining Ghost blogs in Caddy
Deploy remaining services from k3s/ manifests

Install Method

K3s was installed using /etc/rancher/k3s/config.yaml on each node (not INSTALL_K3S_EXEC env vars, which get lost in nested SSH). Binary was downloaded once to pve and distributed via scp. Use INSTALL_K3S_SKIP_DOWNLOAD=true when binary is pre-staged.
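The actual config files are not reproduced in this document. As a rough sketch of what the first control node's file could look like, assuming the YAML keys mirror the CLI flags used in the install section (write_k3s_config is a hypothetical helper name):

```shell
# usage: write_k3s_config <node-wg-ip> [target-dir]
# Writes a config.yaml for the first control-plane node.
write_k3s_config() {
    dir="${2:-/etc/rancher/k3s}"
    mkdir -p "$dir"
    cat > "$dir/config.yaml" <<EOF
# First control-plane node; keys mirror the flags from the install section.
cluster-init: true
disable:
  - traefik
node-ip: $1
flannel-iface: wg0
EOF
}

# Then, with the k3s binary already copied to /usr/local/bin/k3s:
#   curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_DOWNLOAD=true sh -
```

For the second and third control nodes the file would presumably drop cluster-init and add server and token keys instead. Keeping the settings in a file means nothing has to survive quoting through nested SSH.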
