K3s Cluster — Setup & Deployment Notes
This is the production cluster running on Proxmox VMs, connected via WireGuard hub-and-spoke. The VirtualBox learning cluster this replaced is retired.
WireGuard Mesh — Node Assignments
Hub: DO droplet at 138.197.87.251:51820, WG IP 10.0.0.1/24
| Node | vmbr1 IP | WG IP | Proxmox Host |
|---|---|---|---|
| pve-control | 10.10.10.151 | 10.0.0.6 | pve |
| pve-worker | 10.10.10.126 | 10.0.0.7 | pve |
| adder-control | 10.10.10.185 | 10.0.0.8 | adder |
| adder-worker | 10.10.10.83 | 10.0.0.9 | adder |
| game-control | 10.10.10.158 | 10.0.0.10 | game |
| game-worker-hdd | 10.10.10.186 | 10.0.0.11 | game |
| game-worker-ssd | 10.10.10.153 | 10.0.0.12 | game |
| fat_mama | 192.168.40.220 | 10.0.0.13 | workstation (VBox, bridged LAN) |
IPs 10.0.0.2–10.0.0.5 are reserved (old VirtualBox K3s nodes, leave alone).
All VMs are Debian Trixie on vmbr1 (10.10.10.0/24). Inter-node traffic runs over WireGuard (10.0.0.0/24).
K3s Install
Prerequisites — each VM must be on the WireGuard mesh first
WireGuard is configured via wg0.conf on each node (hub-and-spoke through DO droplet).
Verify connectivity: ping 10.0.0.1 from the node.
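The single ping above can be extended to check several peers in one pass. This is a minimal sketch, assuming the iputils `ping` shipped with Debian; the helper name `check_mesh` is made up for illustration and is not part of the cluster scripts.

```sh
# check_mesh: ping each WireGuard IP once and report which peers answer.
# Hypothetical helper; pass the WG IPs from the node table as arguments.
check_mesh() {
  failed=0
  for ip in "$@"; do
    # -c 1: send one probe; -W 2: wait at most 2 seconds for the reply
    if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
      echo "ok   $ip"
    else
      echo "FAIL $ip"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (hub first, then the control nodes):
#   check_mesh 10.0.0.1 10.0.0.6 10.0.0.8 10.0.0.10
```

The function returns non-zero if any peer fails to answer, so it can gate the K3s install step in a script.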
First control plane node (cluster init)
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--cluster-init --disable traefik \
--node-ip=<10.0.0.x> --flannel-iface=wg0" sh -
# Get token for other nodes to join
sudo cat /var/lib/rancher/k3s/server/node-token
Second and third control plane nodes
curl -sfL https://get.k3s.io | K3S_TOKEN=<token> \
INSTALL_K3S_EXEC="server --server https://<control-1-mesh-ip>:6443 --disable traefik \
--node-ip=<this-node-mesh-ip> --flannel-iface=wg0" sh -
Note: pass the explicit server subcommand with --server and leave K3S_URL unset. When K3S_URL is set and INSTALL_K3S_EXEC contains only flags, the install script defaults to agent mode, which joins the node as a worker instead of a control plane peer.
etcd needs a majority of its members (quorum) to stay available, so use an odd number of control nodes: 3 tolerate 1 failure, while 2 tolerate none. Never stop at 2.
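The arithmetic behind the control node count can be checked directly. This is a small sketch of the majority-quorum rule; the function names are illustrative only.

```sh
# Majority quorum for an etcd cluster with n voting members.
quorum() { echo $(( $1 / 2 + 1 )); }

# Members that can fail while a quorum still remains.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  printf '%d members: quorum %d, tolerates %d\n' \
    "$n" "$(quorum "$n")" "$(tolerated_failures "$n")"
done
```

Two members tolerate zero failures and four tolerate only one, the same as three, which is why even counts add cost without adding resilience.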
Workers
curl -sfL https://get.k3s.io | K3S_URL=https://<any-control-mesh-ip>:6443 K3S_TOKEN=<token> \
INSTALL_K3S_EXEC="--node-ip=<this-node-mesh-ip> --flannel-iface=wg0" sh -
kubeconfig for normal user (on any control node)
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown samantha:samantha ~/.kube/config
export KUBECONFIG=~/.kube/config # also add to ~/.bashrc
# Update server IP in config if needed:
sed -i 's/127\.0\.0\.1/<control-1-mesh-ip>/' ~/.kube/config
Label workers
kubectl label node <name> node-role.kubernetes.io/worker=worker
GPU Worker Nodes — adder and game
Both Proxmox hosts adder and game have RTX 2070 GPUs available for PCIe passthrough.
Proxmox PCIe passthrough setup (on each Proxmox host)
# Enable IOMMU in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# (use amd_iommu=on for AMD hosts)
update-grub
reboot
# Blacklist nvidia drivers on host so GPU is free for passthrough:
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
update-initramfs -u
reboot
In Proxmox UI: VM Hardware → Add → PCI Device → select the RTX 2070 → check "All Functions" and "Primary GPU" if it is the only GPU.
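Before adding the PCI device it can help to confirm that IOMMU is active and to see which group the GPU landed in. The following is a sketch that reads the kernel's `/sys/kernel/iommu_groups` tree; the function name `list_iommu_groups` and its optional root argument are assumptions made for illustration.

```sh
# list_iommu_groups: print every PCI device with its IOMMU group number.
# The optional argument overrides the sysfs root (handy for dry runs).
list_iommu_groups() {
  root=${1:-/sys/kernel/iommu_groups}
  for dev in "$root"/*/devices/*; do
    [ -e "$dev" ] || continue        # glob matched nothing (no groups exposed)
    group=${dev#"$root"/}            # strip the root prefix
    group=${group%%/*}               # keep only the group number
    echo "group $group: ${dev##*/}"
  done
}

# On the Proxmox host, compare the addresses with: lspci -nn | grep -i nvidia
list_iommu_groups
```

If the function prints nothing after the reboot, IOMMU is most likely not enabled and the GRUB change did not take effect.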
Inside the GPU worker VM — install NVIDIA drivers
apt-get install -y linux-headers-$(uname -r)
# Add non-free repo if needed:
apt-get install -y nvidia-driver firmware-misc-nonfree
reboot
# Verify:
nvidia-smi
Install NVIDIA device plugin in K3s
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml
Label GPU nodes
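# Use the node names reported by kubectl get nodes (k3s-adder and k3s-game do not appear in the node tables in this README)
kubectl label node k3s-adder nvidia.com/gpu=true
kubectl label node k3s-game nvidia.com/gpu=true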
Verify GPU is schedulable
kubectl get nodes -o json | jq '.items[].status.capacity'
# Should show nvidia.com/gpu: "1" on adder and game
Scheduling a workload to a GPU node
resources:
  limits:
    nvidia.com/gpu: 1
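The resources fragment above belongs inside a container spec. As a rough sketch, a throwaway pod that requests one GPU and runs `nvidia-smi` could be generated like this. The pod name `gpu-smoke-test`, the function name, and the image argument are all placeholders, and depending on how the NVIDIA container runtime is set up on K3s the pod may also need `runtimeClassName: nvidia`.

```sh
# gpu_test_pod: print a one-shot Pod manifest that requests one GPU.
# $1 is the container image; pod name and image are placeholders.
gpu_test_pod() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: $1
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}

# Usage: gpu_test_pod <cuda-image> | kubectl apply -f -
#        kubectl logs gpu-smoke-test
```

A pod that stays Pending usually means no node is advertising `nvidia.com/gpu` capacity yet.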
Namespaces — one per venture
kubectl create namespace sjasoft
kubectl create namespace fulfillment
kubectl create namespace privacy-practice
Secrets are always created per namespace — never share secrets across namespaces.
Secrets
Real secret values are never stored in files. Always create secrets directly on a control node.
# Pattern — adapt per service and namespace
kubectl create secret generic <name> \
--namespace <namespace> \
--from-literal=<key>='<value>'
# Generate passwords with:
openssl rand -base64 24
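The two steps above can be combined so the generated value goes straight into the secret without being written to a file. This is a sketch under stated assumptions: the function name `create_generated_secret` and the example names in the usage line are hypothetical.

```sh
# create_generated_secret: generate a random value and store it as one key
# of a new generic secret. Usage: create_generated_secret <namespace> <name> <key>
create_generated_secret() {
  ns=$1
  name=$2
  key=$3
  # 24 random bytes, base64-encoded (32 characters)
  value=$(openssl rand -base64 24) || return 1
  kubectl create secret generic "$name" \
    --namespace "$ns" \
    --from-literal="$key=$value"
}

# Usage (names are examples only):
#   create_generated_secret fulfillment example-db-secret password
```

The value is never echoed; read it back later with `kubectl get secret` if a service needs it configured elsewhere.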
NodePort Registry
NodePorts must be unique across the entire cluster (range 30000-32767). Any NodePort is reachable on any node's WireGuard IP — K3s routes internally. Caddy on each venture ingress VPS proxies to any node's WG IP + NodePort.
| Port | Service | Notes |
|---|---|---|
| 32368 | ghost1 | blog.the-fulfillment.org |
| 32369 | ghost2 | blog.privacy-practice.com |
| 32370 | ghost3 | blog.sjasoft.com |
| 32371 | forgejo | git.sjasoft.com |
| 32372 | authentik (HTTP) | auth.sjasoft.com — use this behind Caddy |
| 32373 | authentik (HTTPS) | skip — Caddy handles TLS |
| 32374 | mattermost | chat.the-fulfillment.org |
| 32375 | listmonk | deployed |
| 32376 | n8n | deployed |
| 32377 | vaultwarden | deployed |
| 32379 | monerod (RPC) | planned |
| 32380 | monerod (P2P) | planned |
| 32381 | snikket (HTTP) | planned |
| 32382 | snikket (C2S) | planned |
| 32383 | snikket (S2S) | planned |
| 32384 | snikket (proxy65) | planned |
| 32385 | synapse | planned |
| 32386 | nats (client) | deployed |
| 32387 | nats (websocket) | deployed |
| 32388 | nats (monitoring) | deployed |
| 32389 | nats (leafnode) | deployed |
| 32390 | garage (S3 API) | deployed |
| 31540 | garage (web) | deployed |
| 32391 | garage-webui | deployed |
| 32392 | mediawiki | deployed |
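Before claiming a new port it is easy to sanity-check a candidate against the registry. The sketch below validates a list of ports for range and duplicates; `check_nodeports` is an illustrative name, and the port list has to be supplied by hand or extracted from the table.

```sh
# check_nodeports: read one port per line on stdin; report ports outside the
# NodePort range and ports listed more than once. Returns non-zero on problems.
check_nodeports() {
  ports=$(cat)
  status=0
  for p in $ports; do
    if [ "$p" -lt 30000 ] || [ "$p" -gt 32767 ]; then
      echo "out of range: $p"
      status=1
    fi
  done
  # sort | uniq -d prints each value that occurs more than once
  for d in $(printf '%s\n' $ports | sort -n | uniq -d); do
    echo "duplicate: $d"
    status=1
  done
  return "$status"
}

# Usage: check a candidate port against a few registered ones
#   printf '%s\n' 32368 32369 32370 32393 | check_nodeports && echo "no conflicts"
```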
Caddy Pattern — venture ingress VPS
Each venture has its own ingress VPS with its own public IP. Caddy on each proxies to a different node's mesh IP for the same cluster — ventures look unrelated from outside.
# Example — any node's WG IP works for any NodePort
blog.the-fulfillment.org {
reverse_proxy 10.0.0.6:32368
}
git.sjasoft.com {
reverse_proxy 10.0.0.8:32371
}
auth.sjasoft.com {
reverse_proxy 10.0.0.10:32372
}
Pick any node's WG IP per service — they all work. Use different nodes per venture so ventures look unrelated from outside. See the WireGuard mesh table above for IPs.
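Since every site block has the same shape, the blocks can be generated from the domain, a node's WG IP, and the NodePort. This is only a sketch; `caddy_site` is a made-up helper, and the two calls reuse values from the example above.

```sh
# caddy_site: print a Caddyfile site block proxying <domain> to <wg-ip>:<nodeport>.
caddy_site() {
  printf '%s {\n\treverse_proxy %s:%s\n}\n' "$1" "$2" "$3"
}

caddy_site blog.the-fulfillment.org 10.0.0.6 32368
caddy_site git.sjasoft.com 10.0.0.8 32371
```

Redirect the output into the Caddyfile on the relevant ingress VPS and run `caddy validate` against it before reloading.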
Current Deployment Status (2026-04-16)
K3s v1.34.6 cluster fully operational. WireGuard full mesh (direct peer-to-peer over vmbr1, hub for external traffic). Headscale removed — too buggy (0.28.x dropped nodes randomly).
Cluster Nodes
| Node | Role | WG IP | Proxmox Host | Resources |
|---|---|---|---|---|
| pve-control | control-plane, etcd | 10.0.0.6 | pve | 2 CPU, 2GB RAM, 20GB |
| pve-worker | worker | 10.0.0.7 | pve | 8 CPU, 58GB RAM, 3.3TB |
| adder-control | control-plane, etcd | 10.0.0.8 | adder | 2 CPU, 2GB RAM, 20GB |
| adder-worker | worker | 10.0.0.9 | adder | 10 CPU, 58GB RAM, 1.7TB |
| game-control | control-plane, etcd | 10.0.0.10 | game | 2 CPU, 2GB RAM, 20GB |
| game-worker-hdd | worker | 10.0.0.11 | game | 4 CPU, 6GB RAM, 1.4TB HDD |
| game-worker-ssd | worker | 10.0.0.12 | game | 10 CPU, 8GB RAM, 200GB SSD |
| fat_mama | worker | 10.0.0.13 | workstation (VBox) | 20 CPU, 21GB RAM, 200GB |
Running Services
The node in parentheses is the scheduler's current placement; unpinned services may move on restart. Pinned services have nodeName set in their manifest.
| Service | Node | NodePort | Domain | Status |
|---|---|---|---|---|
| postgres:16 | pve-worker (pinned) | ClusterIP | — | running |
| mariadb:11 | adder-worker (pinned) | ClusterIP | — | running |
| ghost1 | unpinned (game-worker-ssd) | 32368 | blog.the-fulfillment.org | running |
| ghost2 | unpinned (pve-worker) | 32369 | blog.privacy-practice.com | running |
| ghost3 | unpinned (adder-worker) | 32370 | blog.sjasoft.com | running |
| forgejo:9 | unpinned (pve-worker) | 32371 | git.sjasoft.com | running |
| authentik server | unpinned (adder-worker) | 32372 | auth.sjasoft.com | running |
| authentik worker | unpinned (adder-worker) | — | — | running |
| listmonk | unpinned (pve-worker) | 32375 | — | running |
| n8n | unpinned (game-worker-ssd) | 32376 | — | running |
| vaultwarden | unpinned (game-worker-hdd) | 32377 | — | running |
| mattermost:10.11 | unpinned (fatmama) | 32374 | chat.the-fulfillment.org | running |
| nats (w/ leafnode + JetStream) | unpinned (fatmama) | 32386-32389 | — | running |
| redis:7 (8GB, AOF) | fatmama (pinned — kernel tuning) | ClusterIP | — | running |
| garage:v2.3 (S3 API + admin + web) | unpinned (adder-worker) w/ anti-affinity on game-worker-hdd | 32390 (S3), 31540 (web), 3903 ClusterIP (admin) | s3.sjasoft.com, page.sjasoft.com | running |
| garage-webui | unpinned (pve-worker) | 32391 | s3-web.sjasoft.com | running |
| mediawiki:1.43 | unpinned (pve-worker) | 32392 | wiki.the-fulfillment.org | running |
| nfs-subdir-external-provisioner | kube-system | — | — | running (StorageClass nas-nfs on /volume1/samantha-private) |
Remaining Services to Deploy
synapse, snikket, monerod, plane (blocked on design decisions)
Next Steps
- Add VirtualBox workstation VMs as workers to this cluster
- Wire up remaining Ghost blogs in Caddy
- Deploy remaining services from k3s/ manifests
Install Method
K3s was installed using /etc/rancher/k3s/config.yaml on each node (not INSTALL_K3S_EXEC env vars,
which get lost in nested SSH). Binary was downloaded once to pve and distributed via scp.
Use INSTALL_K3S_SKIP_DOWNLOAD=true when binary is pre-staged.
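As an illustration of the config-file approach, the sketch below writes a `config.yaml` whose keys mirror the CLI flags used in the install section. The function name `write_k3s_config` is hypothetical and the actual files on the nodes may differ; the first control node would additionally need `cluster-init: true`, and joining nodes would need `server:` and `token:` entries.

```sh
# write_k3s_config: write a config.yaml into directory $1 using node IP $2.
# Keys correspond to --node-ip, --flannel-iface and --disable.
write_k3s_config() {
  mkdir -p "$1"
  cat > "$1/config.yaml" <<EOF
node-ip: $2
flannel-iface: wg0
disable:
  - traefik
EOF
}

# Usage on a node with the k3s binary already at /usr/local/bin/k3s:
#   write_k3s_config /etc/rancher/k3s 10.0.0.6
#   curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_DOWNLOAD=true sh -
```

Because the settings live in a file, they survive however the installer is invoked, including through nested SSH.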