HydraGuard Runbook

WireGuard mesh connecting venues, air units, and cloud infrastructure to the Hydra platform.

Infrastructure

Resource	Value
Hub server	`89.167.57.232` (Hetzner cx23, Falkenstein)
Hub DNS	`hydraguard.experiencenet.com`
Hub WG address	`10.10.0.1/24`
Hub public key	`VGA6ETZB2XFVRRb5KmcFvQ+Ybfh9KKfcWuXfP1IuvQE=`
WireGuard port	`51820/udp`
Mesh file	`/root/.hydraguard/mesh.yaml`
Hub private key	`/etc/wireguard/hub.key`
WG config (generated)	`/etc/wireguard/wg0.conf`
API server	`http://hydraguard.experiencenet.com:8081`
API config	`/root/.hydraguard/api.yaml`
Requests store	`/root/.hydraguard/requests.yaml`
Audit log	`/root/.hydraguard/audit.log`
Service	`systemctl status hydraguard`
Logs	`journalctl -u hydraguard -f`

Current Mesh

Peer	Type	WG Address	LAN	Guard	Status
AD6	venue	10.10.1.1/32	10.0.0.0/24	omada	Online
air-001	air	10.10.100.1/32	--	--	Offline
air-tvl-one	air	10.10.100.2/32	--	--	Offline
air-cederiks24	air	10.10.100.3/32	--	--	Offline
air-hydraneckwebrtc	air	10.10.100.4/32	--	--	Online

Address Scheme

Type	WG tunnel range	LAN range	Capacity
Hub	10.10.0.1/24	--	1
Venues	10.10.1-49.1/32	10.0.X.0/24 (auto or custom)	49
Neck Air	10.10.50-99.1/32	10.0.X.0/24	50
Hydra Air	10.10.100.1-254/32	-- (no LAN)	254
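
The ranges above can be turned into a quick lookup when reading wg output or mesh.yaml. The following is a minimal sketch that classifies a tunnel address by the table; the function name peer_type_for_addr is hypothetical and not part of the hydraguard CLI.

```shell
# Classify a WireGuard tunnel address using the ranges in the table above.
# Prints one of: hub, venue, neckair, air, unknown.
peer_type_for_addr() {
  addr=${1%/*}                       # drop an optional /24 or /32 suffix
  case $addr in
    10.10.0.1) echo hub; return ;;
    10.10.*.*) ;;                    # inside the mesh, classify below
    *) echo unknown; return ;;
  esac
  rest=${addr#10.10.}                # "third.fourth"
  third=${rest%%.*}
  fourth=${rest#*.}
  for octet in "$third" "$fourth"; do
    case $octet in
      ''|*[!0-9]*) echo unknown; return ;;   # reject empty or non-numeric octets
    esac
  done
  if [ "$third" -ge 1 ] && [ "$third" -le 49 ] && [ "$fourth" -eq 1 ]; then
    echo venue
  elif [ "$third" -ge 50 ] && [ "$third" -le 99 ] && [ "$fourth" -eq 1 ]; then
    echo neckair
  elif [ "$third" -eq 100 ] && [ "$fourth" -ge 1 ] && [ "$fourth" -le 254 ]; then
    echo air
  else
    echo unknown
  fi
}
```

For example, peer_type_for_addr 10.10.100.4/32 prints air, matching air-hydraneckwebrtc in the mesh table.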

SSH Access

ssh root@89.167.57.232

Health Check

curl -s http://hydraguard.experiencenet.com:8081/api/v1/health

Operations

All commands run on the hub server unless stated otherwise.

Check status

hydraguard status

Example output:

PEER               TYPE             ADDRESS         HANDSHAKE       TRANSFER
AD6                venue/omada      10.10.1.1       12s ago         4.86 KiB / 3.78 KiB
air-hydraneckwebrtc air              10.10.100.4     53s ago         4.41 KiB / 2.48 KiB
air-001            air              10.10.100.1     -- (offline)    0 / 0
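
For monitoring or a quick check, the status output can be filtered for peers whose HANDSHAKE column shows "-- (offline)". This is a minimal sketch that assumes the column layout of the example output above; the helper name offline_peers is hypothetical.

```shell
# Print the names of peers whose HANDSHAKE column reads "-- (offline)".
# Reads `hydraguard status` output on stdin; assumes the layout shown above,
# with a header row first and the peer name in the first column.
offline_peers() {
  awk 'NR > 1 && /-- \(offline\)/ { print $1 }'
}
```

Usage on the hub: hydraguard status | offline_peers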

HANDSHAKE "X ago": the peer is online and connected.
HANDSHAKE "-- (offline)": no recent handshake; the peer is unreachable.

Raw WireGuard status

wg show wg0

Shows endpoints, allowed IPs, transfer bytes, and last handshake per peer.
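
For scripting, wg show wg0 dump prints the same data tab-separated, one peer per line after an initial interface line, with the latest handshake as epoch seconds (0 if there has been none). The sketch below lists peers whose last handshake is older than a threshold. The 180-second default is an assumption chosen for illustration, and the function name stale_peers is hypothetical.

```shell
# List peers with no handshake in the last $1 seconds (default 180).
# Reads `wg show wg0 dump` on stdin. $2 optionally overrides "now"
# (epoch seconds), which makes the function easy to test.
stale_peers() {
  max_age=${1:-180}
  now=${2:-$(date +%s)}
  awk -F '\t' -v max="$max_age" -v now="$now" '
    NR == 1 { next }                        # first line describes the interface
    $5 == 0 { print $1 "\tnever"; next }    # no handshake recorded for this peer
    now - $5 > max { print $1 "\t" (now - $5) "s" }
  '
}
```

Usage on the hub: wg show wg0 dump | stale_peers 180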

View logs

journalctl -u hydraguard -f              # Follow live
journalctl -u hydraguard -n 100 --no-pager  # Last 100 lines

Restart

systemctl restart hydraguard

Update

hydraguard check-update    # Check if a new version is available
hydraguard update          # Download and install the latest version

Never deploy manually. Always use the release pipeline (tag and push to trigger CI).

Adding Peers

Every add command:

Generates a WireGuard keypair
Stores the public key in mesh.yaml
Prints the private key to stdout (save it immediately; it is shown only once)
Auto-assigns the next available address

After adding any peer, always run hydraguard apply.
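
To illustrate the auto-assignment step, here is a minimal sketch that picks the lowest unused Hydra Air address. Whether HydraGuard reuses freed addresses or always takes the next highest one is not stated in this runbook, so the lowest-free strategy is an assumption, and next_air_addr is a hypothetical helper rather than a hydraguard command.

```shell
# Print the lowest unused Hydra Air address in 10.10.100.1-254.
# Reads the addresses already in use on stdin, one per line
# (an optional /32 suffix is ignored).
next_air_addr() {
  used=$(sed 's|/.*||')              # normalise "10.10.100.3/32" to "10.10.100.3"
  n=1
  while [ "$n" -le 254 ]; do
    candidate="10.10.100.$n"
    # -x matches whole lines only, so 10.10.100.1 does not match 10.10.100.10
    if ! printf '%s\n' "$used" | grep -qxF "$candidate"; then
      echo "$candidate"
      return 0
    fi
    n=$((n + 1))
  done
  echo "no free address in 10.10.100.1-254" >&2
  return 1
}
```

With the four air units in the current mesh as input, this prints 10.10.100.5.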

Add a Venue

hydraguard venue add <name> --location <city> --guard <omada|citymesh|linuxvm|gateway> [--lan <cidr>]
hydraguard apply

Guard types:

Guard type	Use case	Notes
`omada`	TP-Link Omada ER605/ER7212	Configured via Omada SDN Controller API
`citymesh`	Citymesh Guard (Mikrotik)	Bare WireGuard config
`linuxvm`	Linux VM gateway (Azure/GCP/AWS)	Adds PostUp for IP forwarding and masquerade
`gateway`	On-prem LAN gateway (behind FortiGate)	iptables `-I FORWARD 1` (priority), masquerade, MSS clamping

Add a Hydra Air Unit

Standalone render nodes with WireGuard running directly on Windows.

hydraguard air add <id>
hydraguard apply
hydraguard air config <id>    # Get Windows .conf

Add a Neck Air Unit

Mobile venue-in-a-box setups with a Mikrotik router.

hydraguard neckair add <id>
hydraguard apply
hydraguard neckair config <id>    # Get Mikrotik .conf

Get a peer config

hydraguard venue config <name>
hydraguard air config <id>
hydraguard neckair config <id>

Removing peers

hydraguard venue remove <name> && hydraguard apply
hydraguard air remove <id> && hydraguard apply
hydraguard neckair remove <id> && hydraguard apply

The peer becomes unreachable as soon as hydraguard apply completes.

Applying Changes

hydraguard apply

This regenerates /etc/wireguard/wg0.conf and runs wg syncconf to hot-reload. Existing connections are not disrupted. If wg0 is not up, it runs wg-quick up wg0 instead.
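
The reload decision described above can be sketched as a small function. It only prints the command instead of running it. The syncconf form is the idiom documented in the wg-quick manual; whether hydraguard apply invokes exactly this command line is an assumption, and reload_command is a hypothetical name.

```shell
# Print the reload command for an interface, given its current state.
# $1 = interface name, $2 = "up" or "down".
reload_command() {
  if [ "$2" = up ]; then
    # Hot reload: wg-quick strip removes wg-quick-only keys (Address, PostUp,
    # ...) so that wg syncconf can apply the peer list in place.
    printf 'wg syncconf %s <(wg-quick strip %s)\n' "$1" "$1"
  else
    # Interface is not up: bring it up from the generated config.
    printf 'wg-quick up %s\n' "$1"
  fi
}
```

The printed syncconf line uses process substitution, so it needs bash when executed by hand.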

Full restart (when syncconf is not enough)

wg-quick down wg0
wg-quick up wg0

After a full restart, peers behind NAT need up to 25 seconds to re-establish their handshake (PersistentKeepalive interval).

Self-Registration API

Peers can register themselves via the HTTP API instead of requiring SSH access.

Workflow

Client generates a WireGuard keypair locally
Client submits public key via POST /api/v1/register (requires API bearer token)
Request appears as "pending" in requests.yaml
Admin reviews and approves via CLI
Client polls for approval, then fetches its WireGuard config
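
The first two steps of the workflow might look like the sketch below on a client. The request body schema is not documented in this runbook: the JSON field names "name" and "public_key", the HYDRAGUARD_TOKEN variable, and the register_payload helper are all assumptions for illustration. Check the API's actual schema before relying on them.

```shell
# Build a JSON body for POST /api/v1/register.
# ASSUMPTION: the field names "name" and "public_key" are hypothetical.
# $1 = peer name (must not contain quotes or backslashes), $2 = public key.
register_payload() {
  printf '{"name":"%s","public_key":"%s"}\n' "$1" "$2"
}

# Example use on the client (shown as comments; not executed here):
#   umask 077                                   # keep the private key private
#   wg genkey > private.key
#   wg pubkey < private.key > public.key
#   curl -fsS -X POST \
#     -H "Authorization: Bearer $HYDRAGUARD_TOKEN" \
#     -H "Content-Type: application/json" \
#     -d "$(register_payload air-demo "$(cat public.key)")" \
#     http://hydraguard.experiencenet.com:8081/api/v1/register
```

The private key never leaves the client in this flow; only the public key is submitted.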

Managing requests

hydraguard requests list              # Show pending
hydraguard requests list --all        # Show all (including approved/denied)
hydraguard requests approve <id>      # Approve, adds peer to mesh
hydraguard requests deny <id>
hydraguard requests delete <id>

When --auto-apply is enabled, the hub config is updated automatically after approval. Otherwise, run hydraguard apply after approving a request.

Backup

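mesh.yaml holds the entire mesh definition and is the most critical file. Back it up:

cp ~/.hydraguard/mesh.yaml ~/.hydraguard/mesh.yaml.bak    # On the hub: quick local copy
scp root@89.167.57.232:~/.hydraguard/mesh.yaml ./mesh-backup-$(date +%Y%m%d).yaml    # From your workstation: dated off-host copy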

The private key at /etc/wireguard/hub.key should also be backed up securely. If lost, you need to regenerate it and update all peer configs with the new public key.

hydrabackup also backs up the mesh.yaml to hydramirror automatically.
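
For a manual backup that is verified after copying, a small helper along these lines may be useful. It is a sketch only; backup_file and the destination directory layout are hypothetical and unrelated to how hydrabackup works.

```shell
# Copy a file into a backup directory with a date suffix and verify the copy.
# $1 = source file, $2 = destination directory. Prints the backup path.
backup_file() {
  src=$1
  dest_dir=$2
  dest="$dest_dir/$(basename "$src").$(date +%Y%m%d)"
  mkdir -p "$dest_dir" || return 1
  cp -p "$src" "$dest" || return 1
  # Byte-for-byte comparison guards against a truncated copy.
  cmp -s "$src" "$dest" || { echo "backup differs from source" >&2; return 1; }
  chmod 600 "$dest"                  # backups may contain sensitive data
  echo "$dest"
}
```

Usage on the hub: backup_file /root/.hydraguard/mesh.yaml /root/backups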

Troubleshooting

Peer shows "offline" / no handshake

Check the firewall on the hub: ufw status must show port 51820/udp open
Check the peer's connectivity: confirm the peer can reach the internet at all
Verify keys match: the peer's config must contain the hub's public key, and the hub's mesh.yaml must contain the peer's public key
Check PersistentKeepalive: must be 25 in peer configs (HydraGuard sets this automatically)
Check endpoint: the peer config should have Endpoint = hydraguard.experiencenet.com:51820

Handshake works but no data flows

This happens when the WireGuard tunnel negotiates successfully but actual traffic (pings, connections) does not pass through. Common causes:

UFW blocking FORWARD chain on hub. The wg-quick PostUp rule must insert (not append) the FORWARD rule before UFW's default DROP:
```
# Check current FORWARD chain
iptables -L FORWARD -n | head -5
# If the wg0 ACCEPT rule is after ufw-reject-forward, fix it:
iptables -I FORWARD 1 -i wg0 -o wg0 -j ACCEPT
```
The generated wg0.conf uses iptables -I FORWARD 1 to avoid this. If you see -A FORWARD in the conf, update it.
Peer behind NAT has not re-handshaked yet. After wg-quick down/up on the hub, peers behind NAT must re-initiate the handshake themselves. Wait up to 25 seconds for the next keepalive, then check:
```
wg show wg0 | grep -A5 "endpoint"
```
If the peer has an endpoint but "latest handshake" is blank, the peer hasn't sent a keepalive yet.
Routing table missing. After wg-quick down/up, verify routes exist:
```
ip route show dev wg0
```
Should show routes for each peer's AllowedIPs. hydraguard apply automatically syncs kernel routes after wg syncconf, but if routes are still missing, run:
```
wg-quick down wg0 && wg-quick up wg0
```
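
The rule-order check from the first cause can be automated. iptables -S FORWARD prints the chain's rules in evaluation order, so the sketch below reports whether an ACCEPT rule for the interface appears before the first rule that can drop traffic. The helper name forward_rule_order is hypothetical, and treating a jump to ufw-reject-forward as the blocking point follows the description above.

```shell
# Read `iptables -S FORWARD` on stdin and print:
#   ok      - an ACCEPT rule for interface $1 comes before any DROP/REJECT
#   blocked - a DROP, REJECT, or ufw-reject-forward jump comes first
#   missing - neither kind of rule was found
forward_rule_order() {
  awk -v ifc="$1" '
    $1 != "-A" { next }                                   # skip the -P policy line
    index($0, "-i " ifc " ") && / -j ACCEPT$/ { print "ok"; found = 1; exit }
    / -j (DROP|REJECT|ufw-reject-forward)/ { print "blocked"; found = 1; exit }
    END { if (!found) print "missing" }
  '
}
```

Usage on the hub: iptables -S FORWARD | forward_rule_order wg0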

Can't reach a venue's LAN devices

Test connectivity step by step:

ping 10.10.X.1    # 1. VPN box tunnel address (WG layer)
ping 10.0.X.1     # 2. VPN box LAN gateway (routing through VPN box)
ping 10.0.X.100   # 3. A device on the LAN

If step 1 works but step 2 or 3 fails:

The VPN box (ER605/Mikrotik) is not forwarding traffic between WG and LAN
Check firewall rules on the VPN box
On Omada: check via the Omada SDN Controller (see omada-venue.md)

If step 1 fails:

Check hydraguard status for handshake
Verify the LAN CIDR in mesh.yaml matches the actual venue LAN (common mistake: 10.0.1.0/24 vs 10.0.0.0/24)
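
The three steps can be run in sequence so that the first failure is reported directly. This is a minimal sketch; diagnose_venue is a hypothetical helper, and the ping flags assume Linux iputils (-W is the reply timeout in seconds). The addresses are passed explicitly because a venue's LAN range does not necessarily share the third octet of its tunnel address (AD6 uses 10.10.1.1 with LAN 10.0.0.0/24).

```shell
# Ping the tunnel address, the LAN gateway, and a LAN device in order.
# $1 = VPN box tunnel address, $2 = LAN gateway, $3 = a device on the LAN.
# Stops at the first unreachable target and names the failing step.
diagnose_venue() {
  step=1
  for target in "$1" "$2" "$3"; do
    if ! ping -c 1 -W 2 "$target" >/dev/null 2>&1; then
      echo "step $step failed: $target unreachable"
      return 1
    fi
    step=$((step + 1))
  done
  echo "all steps passed"
}
```

Example for AD6: diagnose_venue 10.10.1.1 10.0.0.1 10.0.0.100. A failure at step 3 alone can also mean the device blocks ICMP.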

Bodies (Windows render nodes) unreachable via ping but online

Windows Firewall blocks ICMP by default. The bodies may be online and functional even if ping fails. Verify by:

Checking the Omada controller's client list (shows MAC, connection status)
Checking hydraneck's scan results
Trying to connect to a known service port on the body

Inter-peer traffic not forwarding (e.g., hydraneckwebrtc cannot reach venue LAN)

Traffic between two WG peers (e.g., hydraneckwebrtc at 10.10.100.4 reaching AD6 LAN at 10.0.0.0/24) must be forwarded by the hub. Check:

IP forwarding enabled: cat /proc/sys/net/ipv4/ip_forward must print 1
iptables FORWARD rule: run iptables -L FORWARD -n | head -3; the ACCEPT rule for wg0 must come before any DROP/REJECT
Both peers connected: both the source peer and the destination venue must have active handshakes

WireGuard interface won't come up

ip link show wg0           # Check if interface exists
wg-quick strip wg0         # Check config syntax
journalctl -u wg-quick@wg0 # Check logs

DNS not resolving

dig +short hydraguard.experiencenet.com @8.8.8.8

If DNS doesn't resolve, check the A record in Hetzner DNS (zone 788422).

mesh.yaml out of sync with wg0.conf

hydraguard apply    # Regenerates wg0.conf from mesh.yaml and syncs

Full reset

wg-quick down wg0
rm /etc/wireguard/wg0.conf
hydraguard apply

Releasing

git tag v1.1.0
git push origin v1.1.0

This triggers CI which builds binaries for linux/darwin x amd64/arm64 and publishes them as a GitHub Release. The hub picks up new versions via hydraguard update.
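
Before tagging, it can help to check that the tag follows the vMAJOR.MINOR.PATCH shape used in the example. The exact tag pattern that CI triggers on is not stated here, so this format is an assumption, and is_release_tag is a hypothetical helper.

```shell
# Succeed only if $1 looks like vMAJOR.MINOR.PATCH, e.g. v1.1.0.
# ASSUMPTION: CI expects this shape; the real trigger pattern may differ.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}
```

Usage: is_release_tag v1.1.0 && git tag v1.1.0 && git push origin v1.1.0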
