diff --git a/K3s-SESSION-STATE.md b/K3s-SESSION-STATE.md
new file mode 100644
index 0000000..8450a79
--- /dev/null
+++ b/K3s-SESSION-STATE.md
@@ -0,0 +1,142 @@
+# K3s Session State
+# Saved: 2026-04-06 (end of session 3)
+
+## Current State
+
+New Proxmox-based K3s cluster in progress. VirtualBox cluster retired.
+All 7 Proxmox VMs created and on WireGuard mesh. K3s not yet installed.
+Old VirtualBox services (ghost, forgejo, postgres, mariadb) still running on old cluster until migration complete.
+
+## Proxmox VMs
+
+| Node | vmbr1 IP | WG IP | Proxmox Host | Role |
+|---|---|---|---|---|
+| pve-control | 10.10.10.151 | 10.0.0.6 | pve | k3s control plane |
+| pve-worker | 10.10.10.126 | 10.0.0.7 | pve | k3s worker |
+| adder-control | 10.10.10.185 | 10.0.0.8 | adder | k3s control plane |
+| adder-worker | 10.10.10.83 | 10.0.0.9 | adder | k3s worker |
+| game-control | 10.10.10.158 | 10.0.0.10 | game | k3s control plane |
+| game-worker-hdd | 10.10.10.186 | 10.0.0.11 | game | k3s worker (local-lvm/HDD) |
+| game-worker-ssd | 10.10.10.153 | 10.0.0.12 | game | k3s worker (game-ssd/NVMe) |
+
+WG IPs 10.0.0.2–10.0.0.5 reserved (old VirtualBox nodes, do not reuse).
+Hub: DO droplet at 138.197.87.251:51820, WG IP 10.0.0.1
+
+## VM Specs
+
+| Node | vCPUs | RAM | Disk | Storage |
+|---|---|---|---|---|
+| pve-control | 2 | 2GB | 20G | local-lvm |
+| pve-worker | 6 | 8GB | 100G | local-lvm |
+| adder-control | 2 | 2GB | 20G | local-lvm |
+| adder-worker | 6 | 8GB | 100G | local-lvm |
+| game-control | 2 | 2GB | 20G | local-lvm |
+| game-worker-hdd | 6 | 8GB | 200G | local-lvm (HDD) |
+| game-worker-ssd | 10 | 8GB | 200G | game-ssd (NVMe) |
+
+## Network Architecture
+
+- All VMs on vmbr1 (10.10.10.0/24), DHCP
+- WireGuard mesh via DO hub — all nodes have static WG IPs (10.0.0.0/24)
+- Full mesh: all nodes have each other as explicit WireGuard peers (not just hub-and-spoke)
+- K3s will use --flannel-iface=wg0 so all cluster traffic runs over WireGuard
+- Caddy at DO hub proxies external traffic to any node's WG IP + NodePort
+- Tailscale/Headscale abandoned — too unreliable for cluster networking
+
+## Proxmox Host Specs
+
+- pve: workstation i9-13900KF, 96GB RAM
+- adder: Proxmox node with RTX 2070, 4TB NVMe available
+- game: Proxmox node with RTX 2070, 16GB RAM, 256GB NVMe (game-ssd) + 2TB HDD (local-lvm)
+
+## VM Provisioning
+
+### Template & Clone Scripts
+Scripts at `~/private/Knowledge/repos/homelab/proxmox/scripts/`:
+- `create-debian-template.sh [STORAGE] [BRIDGE]`
+  - Defaults: STORAGE=local-lvm, BRIDGE=vmbr1
+  - Bakes in: qemu-guest-agent, curl, wget, nano, rsync, htop, tmux, emacs-nox, nfs-common, tailscale
+  - Zeroes /etc/machine-id, removes /etc/ssh/ssh_host_* (Cloud-Init regenerates on first boot)
+  - Does NOT create .ssh or set keys — done post-boot via qm set
+- `clone-vm.sh [CORES] [MEMORY_MB] [DISK_SIZE] [STORAGE]`
+  - Defaults: 2 cores, 2048MB RAM, 20G disk, local-lvm storage
+  - Full clone, auto-starts the VM
+
+### Post-Clone Formula (confirmed working)
+1. Clone: `./clone-vm.sh