Replaced Headscale (too buggy in 0.28.x — random node drops) with direct WireGuard hub-and-spoke + full mesh. 7 Proxmox VMs across 3 hosts form a K3s v1.34.6 cluster: 3 control-plane/etcd nodes, 4 workers. Running services: postgres, mariadb, ghost (x3), forgejo, authentik. All unpinned services use local-path StorageClass. Databases pinned to pve-worker and adder-worker with local PVs. Includes VM provisioning scripts (create-debian-template.sh, clone-vm.sh), K3s manifests for all services, and full deployment docs in k3s/README.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
K3s Session State
Saved: 2026-04-06 (end of session 3)
Current State
New Proxmox-based K3s cluster in progress. VirtualBox cluster retired. All 7 Proxmox VMs created and on WireGuard mesh. K3s not yet installed. Old VirtualBox services (ghost, forgejo, postgres, mariadb) still running on old cluster until migration complete.
Proxmox VMs
| Node | vmbr1 IP | WG IP | Proxmox Host | Role |
|---|---|---|---|---|
| pve-control | 10.10.10.151 | 10.0.0.6 | pve | k3s control plane |
| pve-worker | 10.10.10.126 | 10.0.0.7 | pve | k3s worker |
| adder-control | 10.10.10.185 | 10.0.0.8 | adder | k3s control plane |
| adder-worker | 10.10.10.83 | 10.0.0.9 | adder | k3s worker |
| game-control | 10.10.10.158 | 10.0.0.10 | game | k3s control plane |
| game-worker-hdd | 10.10.10.186 | 10.0.0.11 | game | k3s worker (local-lvm/HDD) |
| game-worker-ssd | 10.10.10.153 | 10.0.0.12 | game | k3s worker (game-ssd/NVMe) |
WG IPs 10.0.0.2–10.0.0.5 reserved (old VirtualBox nodes, do not reuse). Hub: DO droplet at 138.197.87.251:51820, WG IP 10.0.0.1
VM Specs
| Node | vCPUs | RAM | Disk | Storage |
|---|---|---|---|---|
| pve-control | 2 | 2GB | 20G | local-lvm |
| pve-worker | 6 | 8GB | 100G | local-lvm |
| adder-control | 2 | 2GB | 20G | local-lvm |
| adder-worker | 6 | 8GB | 100G | local-lvm |
| game-control | 2 | 2GB | 20G | local-lvm |
| game-worker-hdd | 6 | 8GB | 200G | local-lvm (HDD) |
| game-worker-ssd | 10 | 8GB | 200G | game-ssd (NVMe) |
Network Architecture
- All VMs on vmbr1 (10.10.10.0/24), DHCP
- WireGuard mesh via DO hub — all nodes have static WG IPs (10.0.0.0/24)
- Full mesh: all nodes have each other as explicit WireGuard peers (not just hub-and-spoke)
- K3s will use --flannel-iface=wg0 so all cluster traffic runs over WireGuard
- Caddy at DO hub proxies external traffic to any node's WG IP + NodePort
- Tailscale/Headscale abandoned — too unreliable for cluster networking
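The full-mesh layout described above could be generated from the WG IP table rather than typed by hand. The following is only a sketch: the public keys and peer endpoints are placeholders (the notes do not record them), the function name `mesh_peers` is hypothetical, and `PersistentKeepalive = 25` is an assumed value, not one taken from the real configs. The hub at 10.0.0.1 would still need its own stanza.

```shell
#!/bin/sh
# Node name and WireGuard IP pairs, copied from the Proxmox VMs table.
NODES="pve-control:10.0.0.6
pve-worker:10.0.0.7
adder-control:10.0.0.8
adder-worker:10.0.0.9
game-control:10.0.0.10
game-worker-hdd:10.0.0.11
game-worker-ssd:10.0.0.12"

# Print one [Peer] stanza for every node except $1 (the node being configured).
# Public keys and endpoints are placeholders (assumption): fill them in by hand.
mesh_peers() {
  self="$1"
  printf '%s\n' "$NODES" | while IFS=: read -r name ip; do
    [ "$name" = "$self" ] && continue
    printf '[Peer]\n'
    printf '# %s\n' "$name"
    printf 'PublicKey = <%s-public-key>\n' "$name"
    printf 'AllowedIPs = %s/32\n' "$ip"
    printf '# Endpoint = <reachable-address>:51820\n'
    printf 'PersistentKeepalive = 25\n\n'
  done
}

# Example: the peer section for pve-control's wg0.conf.
mesh_peers pve-control
```

Each node gets six stanzas (every other VM), with a /32 `AllowedIPs` entry so traffic for a given WG IP goes to that peer directly.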
Proxmox Host Specs
- pve: workstation i9-13900KF, 96GB RAM
- adder: Proxmox node with RTX 2070, 4TB NVMe available
- game: Proxmox node with RTX 2070, 16GB RAM, 256GB NVMe (game-ssd) + 2TB HDD (local-lvm)
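As a quick sanity check, the RAM figures from the VM Specs table can be summed per host and compared with the physical RAM listed above. This is a minimal sketch: the function names are hypothetical, and adder is left out of the comparison because its RAM is not stated in these notes.

```shell
#!/bin/sh
# VM RAM allocations in GB as host:vm:ram (from the VM Specs table).
VM_RAM="pve:pve-control:2
pve:pve-worker:8
adder:adder-control:2
adder:adder-worker:8
game:game-control:2
game:game-worker-hdd:8
game:game-worker-ssd:8"

# Sum the RAM allocated to all VMs on host $1.
allocated_gb() {
  printf '%s\n' "$VM_RAM" |
    awk -F: -v host="$1" '$1 == host { sum += $3 } END { print sum + 0 }'
}

# Compare the allocation on host $1 against its physical RAM $2 (GB).
check_host() {
  used="$(allocated_gb "$1")"
  if [ "$used" -gt "$2" ]; then
    echo "$1: ${used}GB allocated > ${2}GB physical (overcommitted)"
  else
    echo "$1: ${used}GB allocated <= ${2}GB physical"
  fi
}

check_host pve 96
check_host game 16
```

With the current table, game comes out at 18GB allocated against 16GB physical. Whether that is a problem in practice depends on ballooning and actual guest usage, which these notes do not cover.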
VM Provisioning
Template & Clone Scripts
Scripts at ~/private/Knowledge/repos/homelab/proxmox/scripts/:
- `create-debian-template.sh <VMID> <NAME> [STORAGE] [BRIDGE]`
  - Defaults: STORAGE=local-lvm, BRIDGE=vmbr1
  - Bakes in: qemu-guest-agent, curl, wget, nano, rsync, htop, tmux, emacs-nox, nfs-common, tailscale
  - Zeroes /etc/machine-id, removes /etc/ssh/ssh_host_* (Cloud-Init regenerates on first boot)
  - Does NOT create .ssh or set keys; that is done post-boot via qm set
- `clone-vm.sh <TEMPLATE_VMID> <NEW_VMID> <NAME> [CORES] [MEMORY_MB] [DISK_SIZE] [STORAGE]`
  - Defaults: 2 cores, 2048MB RAM, 20G disk, local-lvm storage
  - Full clone, auto-starts the VM
Post-Clone Formula (confirmed working)
1. Clone: `./clone-vm.sh <template> <vmid> <name> [cores] [mem] [disk] [storage]`
2. Get IP: `qm guest cmd <vmid> network-get-interfaces`
3. Set SSH key: `qm set <vmid> --sshkeys <pubkey-file>`
4. Reboot VM: `qm reboot <vmid>`
5. SSH in: `ssh samantha@<ip>`
6. Configure WireGuard on the VM
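The `qm` steps of the post-clone formula above could be wrapped in a small helper. This is a sketch rather than a script from the repo: `post_clone` and `run` are hypothetical names, the VMID 101 and key path are illustrative, and it defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Print the command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Post-clone steps for one VM. Arguments: <vmid> <pubkey-file>
post_clone() {
  vmid="$1"
  pubkey="$2"
  run qm guest cmd "$vmid" network-get-interfaces   # get the VM's IP
  run qm set "$vmid" --sshkeys "$pubkey"            # set the SSH key
  run qm reboot "$vmid"                             # reboot the VM
}

# Example (dry run): VMID and key path are placeholders.
post_clone 101 "$HOME/.ssh/id_ed25519.pub"
```

Running it with `DRY_RUN=0` on the Proxmox host would execute the commands instead of printing them. SSH login and WireGuard setup remain manual steps.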
VMID Convention
- pve: 100-199 (templates at 199)
- adder: 200-299 (templates at 299; the current template is at 200, destroy it after use)
- game: 300-399 (templates at 399; the current template is at 300, destroy it after use)
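The convention above is simple enough to encode as a guard in provisioning scripts. A minimal sketch, with hypothetical function names:

```shell
#!/bin/sh
# Map a VMID to its Proxmox host using the ranges in the VMID convention.
host_for_vmid() {
  if [ "$1" -ge 100 ] && [ "$1" -le 199 ]; then
    echo pve
  elif [ "$1" -ge 200 ] && [ "$1" -le 299 ]; then
    echo adder
  elif [ "$1" -ge 300 ] && [ "$1" -le 399 ]; then
    echo game
  else
    echo unknown
    return 1
  fi
}

# The convention reserves the last ID of each range for templates.
is_template_slot() {
  case "$1" in
    199|299|399) return 0 ;;
    *) return 1 ;;
  esac
}

host_for_vmid 301
```

A clone script could call `host_for_vmid` before running `qm clone` to refuse IDs that belong to another host.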
Useful Proxmox CLI
- `qm guest cmd <VMID> network-get-interfaces`: get VM IP
- `qm set <VMID> --vga std --delete serial0`: fix serial console
- `qm destroy <VMID> --purge`: remove VM
- `qm list`: list all VMs
- `vgs`: check local-lvm free space
- `pvesh get /nodes/<nodename>/status`: CPU/memory usage
Immediate Next Steps
- Install K3s on pve-control first (--cluster-init)
- Join adder-control and game-control as control plane peers
- Join all 4 workers
- Label workers and GPU nodes
- Create namespaces: sjasoft, fulfillment, privacy-practice
- Migrate services from old VirtualBox cluster
K3s Install — see k3s/README.md for full commands
- Control plane uses --cluster-init on first node, --server on subsequent nodes
- All nodes use --flannel-iface=wg0 and --node-ip=
- Traefik disabled on all nodes
- 3 control plane nodes for HA etcd (tolerates 1 failure)
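The flag combinations above can be sketched as a command generator. k3s/README.md remains the authoritative source for the real commands; this only prints candidate command lines and executes nothing. The function names are hypothetical, `<token>` is a literal placeholder, and `--disable traefik` is applied only to server nodes here on the assumption that it is a server-side option.

```shell
#!/bin/sh
# Print the K3s install command for one node; nothing is executed.
# Arguments: <role: init|server|agent> <node WG IP> [first-server WG IP]
k3s_install_cmd() {
  role="$1"; node_ip="$2"; first="${3:-}"
  common="--flannel-iface=wg0 --node-ip=$node_ip"
  installer="curl -sfL https://get.k3s.io | sh -s -"
  case "$role" in
    init)   echo "$installer server --cluster-init --disable traefik $common" ;;
    server) echo "$installer server --server https://$first:6443 --token <token> --disable traefik $common" ;;
    agent)  echo "$installer agent --server https://$first:6443 --token <token> $common" ;;
    *)      echo "unknown role: $role" >&2; return 1 ;;
  esac
}

# etcd quorum arithmetic: n members need floor(n/2)+1 healthy members,
# so the cluster tolerates n - (floor(n/2)+1) failures.
etcd_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

k3s_install_cmd init   10.0.0.6             # pve-control, first server
k3s_install_cmd server 10.0.0.8 10.0.0.6    # adder-control joins
k3s_install_cmd agent  10.0.0.7 10.0.0.6    # pve-worker joins
etcd_tolerated_failures 3
```

The quorum helper confirms the figure in the list: three control-plane nodes tolerate one failure, and a fourth would not raise that number.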
Running Services (old VirtualBox cluster — not yet migrated)
- postgres:16 — ClusterIP:5432
- mariadb:11 — ClusterIP:3306
- ghost1/2/3 — NodePorts 32368/32369/32370
- forgejo:9 — NodePort 32371, git.sjasoft.com
NodePort Registry
| Port | Service | Namespace |
|---|---|---|
| 32368 | ghost1 | fulfillment |
| 32369 | ghost2 | fulfillment |
| 32370 | ghost3 | fulfillment |
| 32371 | forgejo | sjasoft |
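When porting further services, the next NodePort can be picked mechanically from the registry. A minimal sketch, assuming the default Kubernetes NodePort range of 30000-32767 has not been changed on this cluster; `next_free_nodeport` is a hypothetical helper name.

```shell
#!/bin/sh
# Ports already assigned in the NodePort registry table.
REGISTRY="32368 32369 32370 32371"

# Print the lowest unassigned port at or above $1, staying within the
# default Kubernetes NodePort range (30000-32767).
next_free_nodeport() {
  port="$1"
  while [ "$port" -le 32767 ]; do
    case " $REGISTRY " in
      *" $port "*) port=$((port + 1)) ;;
      *) echo "$port"; return 0 ;;
    esac
  done
  echo "no free NodePort at or above $1" >&2
  return 1
}

next_free_nodeport 32368   # prints 32372
```

Keeping `REGISTRY` in sync with the table is manual; the table stays the source of truth.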
Manifests
All in Knowledge/repos/homelab/k3s/:
- k3s/postgres/postgres.yaml
- k3s/mariadb/mariadb.yaml
- k3s/ghost/ghost.yaml
- k3s/forgejo/forgejo.yaml
- k3s/README.md (authoritative WG mesh table + K3s install commands)
Remaining Services to Port (from Proxmox Docker stack)
- authentik.yml — SSO (postgres)
- n8n.yml — automation (postgres)
- vaultwarden.yml — passwords
- nats.yml — messaging
- monerod.yml — monero node
- snikket.yml — XMPP
- synapse.yml — Matrix
Known Issues / Notes
- Tailscale/Headscale abandoned — unreliable, randomly drops nodes, requires manual reconnect
- WireGuard full mesh is the correct approach for K3s cluster networking
- kubectl requires KUBECONFIG=~/.kube/config in ~/.bashrc on control nodes
- Cross-namespace secrets not supported: a pod can only reference Secrets in its own namespace, so keep each secret in the same namespace as its consumer
- game node only has 16GB RAM — allocate worker VMs conservatively
- game-ssd is only 256GB NVMe — keep disk allocations conservative on game-worker-ssd
- Templates should be destroyed after all clones are complete on each node