Show HN: A 15-minute backup audit framework for self-hosted homelabs

Complete Proxmox Backup Audit in 15 Minutes | ScholarNet AI Blog

Homelab Backup Audit: The 15-Minute Checklist Most Self-Hosted Operators Skip

If your last verified restore was "sometime last year," you probably have silently broken backups. I know because I've been there: staring at a terminal at 2am, realizing my "perfect" Proxmox backup job had been writing empty SQL files for six months. The pattern is universal across Proxmox, Docker, Unraid, and TrueNAS stacks: backups get configured, work for a while, then fail silently when something upstream changes (a volume rename, a container migration, a disk swap). The audit below catches these failures in 15 minutes against your actual inventory.

⚡ Quick Summary

Run this 15-minute checklist monthly to verify that backups actually work in Proxmox, Docker, Unraid, or TrueNAS. The free 5-prompt sample catches silent failures before they turn into data loss.

By Dale Weaver · Updated May 31, 2026. Tested against my own Proxmox + Docker + Unraid + TrueNAS homelab. The framework below is the basis of The Operator's Cockpit — a 40-prompt pack for solo homelab operators.

Why every long-running homelab eventually breaks backups silently

Backup software runs fine. Logs say success. Storage is healthy. Until you try to restore — and discover one of these:

The Postgres container migrated from VM-101 to LXC-205 six months ago, but the pg_dump script still points at the old hostname. Result: 6 months of backups are empty SQL files. I found this one the hard way during a "routine" recovery test. Spoiler: it wasn't routine.

Vaultwarden's bind-mount was renamed during a Docker compose refactor. Backups still run on the old path (which is now empty). The restore is technically valid — but you've been backing up nothing.

TrueNAS replication job to the offsite NAS is "working," but the offsite NAS ran out of space 90 days ago and silently rejects new snapshots. The local job logs success because it doesn't verify the remote side.

Plex metadata DB is on a separate Docker volume that was added after the backup script was written. It's never been backed up. You'll only find out the day Plex's transcoder corrupts the watch history — and trust me, losing your curated playlists hurts more than you'd think.

None of these surface in normal logs. The backup tool reports success because the tool's job did succeed — it just operated on the wrong target, the wrong path, or a broken remote. As a fellow operator once told me: "Your backup software isn't the problem. Your assumptions are." Catching this requires a state-level audit that compares what's running against what's being backed up.
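One cheap guard against the empty-dump case is to look at the newest backup artifact itself rather than the job's exit status. The sketch below is a minimal example, assuming dumps land as files in a single directory. The function name `check_latest_dump`, the `*.sql` pattern, and both thresholds are illustrative assumptions, not part of any backup tool.

```python
import time
from pathlib import Path


def check_latest_dump(directory, pattern="*.sql", min_bytes=1024, max_age_days=7):
    """Return a list of problems with the newest matching file (empty list = OK).

    All thresholds are illustrative defaults; tune them per service.
    """
    files = [p for p in Path(directory).glob(pattern) if p.is_file()]
    if not files:
        return [f"no files matching {pattern} in {directory}"]

    # Pick the most recently modified dump and inspect it.
    newest = max(files, key=lambda p: p.stat().st_mtime)
    stat = newest.stat()

    problems = []
    if stat.st_size < min_bytes:
        problems.append(f"{newest.name} is only {stat.st_size} bytes")
    age_days = (time.time() - stat.st_mtime) / 86400
    if age_days > max_age_days:
        problems.append(f"{newest.name} is {age_days:.0f} days old")
    return problems
```

A size and age check like this only catches the crudest failures. It does not prove the dump restores, which is what the audit steps below are for.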

The 5-step audit

Step 1 — Inventory your stateful services

Make a flat list. One service per row. Use this format:

name, type, host, last_backup_iso, last_verified_restore_iso
Postgres-prod, database, vm-101, 2026-05-28, 2026-03-15
Vaultwarden, password-mgr, vm-103, 2026-05-28,
Caddy-config, reverse-proxy, vm-101, 2026-05-28,
Plex-metadata, media-db, lxc-201, ,
Home-Assistant, home-automation, lxc-204, 2026-05-29, 2026-04-02
TrueNAS-replication, snapshot-job, nas-01, 2026-05-30,
Nextcloud-data, file-storage, vm-105, 2026-05-29, 2026-02-08

The last_verified_restore_iso column is the one most operators leave blank. That's the column that catches the silent failures. When I first ran this audit on my own setup, I had four blank cells in that column. Embarrassing, but fixable.
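The inventory format from Step 1 is simple enough to check with a script. The sketch below is a minimal example, assuming the rows are kept as comma-separated text under the header shown in Step 1. The function names `load_inventory` and `blank_restores` are illustrative, and the sample holds only three of the rows.

```python
import csv
import io

# Three sample rows in the Step 1 format; an empty last column means
# the restore has never been verified.
INVENTORY = """\
name, type, host, last_backup_iso, last_verified_restore_iso
Postgres-prod, database, vm-101, 2026-05-28, 2026-03-15
Vaultwarden, password-mgr, vm-103, 2026-05-28,
Plex-metadata, media-db, lxc-201, ,
"""


def load_inventory(text):
    """Parse the inventory into a list of dicts, stripping stray whitespace."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [
        {key.strip(): (value or "").strip()
         for key, value in row.items() if key is not None}
        for row in reader
    ]


def blank_restores(rows):
    """Return the names of services with no verified restore on record."""
    return [r["name"] for r in rows if not r["last_verified_restore_iso"]]


if __name__ == "__main__":
    print(blank_restores(load_inventory(INVENTORY)))
    # prints ['Vaultwarden', 'Plex-metadata']
```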

Step 2 — Flag any service with no backup or no verified restore in 90+ days

Apply these rules in order:

RED FLAG: last_backup_iso is empty or older than 7 days. The service is unprotected.

YELLOW FLAG: last_backup_iso is recent but last_verified_restore_iso is empty or older than 90 days. The backup exists but you don't know if it works.

GREEN: recent backup AND verified restore within 90 days.
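The three rules above translate directly into a small function. This is a sketch, not a definitive implementation. It assumes "recent" means no older than 7 days, mirroring the RED rule, and that an empty cell means "never". The name `classify` and its parameters are illustrative.

```python
from datetime import date


def classify(last_backup_iso, last_verified_restore_iso, today,
             backup_max_days=7, restore_max_days=90):
    """Apply the RED / YELLOW / GREEN rules in order and return the flag."""

    def age_days(iso):
        # An empty cell means "never", so treat it as infinitely old.
        if not iso:
            return float("inf")
        return (today - date.fromisoformat(iso)).days

    if age_days(last_backup_iso) > backup_max_days:
        return "RED"      # no backup, or backup older than 7 days
    if age_days(last_verified_restore_iso) > restore_max_days:
        return "YELLOW"   # backup exists, restore unverified or stale
    return "GREEN"


if __name__ == "__main__":
    today = date(2026, 5, 31)
    print(classify("2026-05-28", "2026-03-15", today))  # GREEN (77 days)
    print(classify("2026-05-28", "", today))            # YELLOW
    print(classify("", "", today))                      # RED
    print(classify("2026-05-29", "2026-02-08", today))  # YELLOW (112 days)
```

Checking the backup rule first matters: a service with a verified restore from last month but no backup this week is still RED.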

In most homelabs I've audited, 30-60% of services come up red or yellow after this single pass, and the operator had no idea. My own was at 40%. Not great, but knowing is half the battle.

Step 3 — Cross-check the backup target paths against what's actually running

For each backed-up service, grep your backup config for the actual volume / bind-mount / database path. Compare against the live container or VM's mount table. Look for:

Path drift: backup config references /srv/postgres-data but the container actually mounts /srv/postgres-prod-data. The backup runs against an empty (or stale) directory.

Stopped service drift: container was stopped 3 months ago but backup config still includes it. Wastes storage and obscures the active set.

New service gaps: service was added in the last 90 days but never added to the backup config. Completely unprotected.
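The three kinds of drift above reduce to set comparisons once both sides are written down. The sketch below is a minimal example that assumes you have already collected two mappings by hand or by script: the paths each backup job reads, and the host paths each running service mounts. How you collect them (for example from `docker inspect` output or VM configs) is left out. The name `find_drift` and the sample paths other than the two from the path-drift example are hypothetical.

```python
def find_drift(backup_config, live_mounts):
    """Compare backed-up paths against what is actually mounted.

    backup_config: {service: set of paths the backup job reads}
    live_mounts:   {service: set of host paths mounted right now};
                   stopped services are simply absent from this mapping.
    Returns a list of (kind, service, paths) tuples.
    """
    findings = []
    for service, backed_up in sorted(backup_config.items()):
        if service not in live_mounts:
            # Still in the backup config, but nothing is running.
            findings.append(("stopped-service-drift", service, sorted(backed_up)))
            continue
        # Paths the backup reads that the service no longer mounts.
        stale = backed_up - live_mounts[service]
        if stale:
            findings.append(("path-drift", service, sorted(stale)))
    # Running services that no backup job mentions at all.
    for service in sorted(set(live_mounts) - set(backup_config)):
        findings.append(("new-service-gap", service, sorted(live_mounts[service])))
    return findings


if __name__ == "__main__":
    backup_config = {
        "postgres": {"/srv/postgres-data"},
        "old-wiki": {"/srv/wiki"},
    }
    live_mounts = {
        "postgres": {"/srv/postgres-prod-data"},
        "plex": {"/srv/plex-metadata"},
    }
    for finding in find_drift(backup_config, live_mounts):
        print(finding)
```

The comparison is deliberately one-directional for known services: it reports backed-up paths that are no longer mounted, but not every mount that lacks a backup, since many mounts (sockets, timezone files) do not need one.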

Step 4 — Test one restore. Just one.

Pick the service you'd most regret losing. Restore the most recent backup to...
