Self-Healing Infrastructure — The Book
Self-Healing Infrastructure: Building an Autonomous Homelab with Ansible
This is not a tutorial that stops at “hello world.” This is the whole stack — from bare-metal Proxmox provisioning to the operations agent that makes your infrastructure fix itself. Every decision explained. Every tradeoff documented.

What you’ll build
- The operations agent — remediation, vulnerability scanning, config drift detection
- Fleet monitoring — Prometheus + Grafana + Loki deployed across your fleet with a single Ansible run
- Automated backups — Restic snapshots with daily verification
- Reverse proxy — Caddy with automatic TLS
- DNS — BIND9 managed by Ansible
- Idempotent roles — tested with Molecule
12 chapters, real configs
Every chapter uses real YAML, real playbooks, and real alert rules pulled from a production homelab running 13 hosts and 34 containers. No toy examples.
| Chapter | Topic |
|---|---|
| 1–2 | Proxmox provisioning and base OS hardening |
| 3–4 | Ansible roles, inventory, and vault |
| 5–6 | Prometheus, Grafana, Loki — fleet monitoring |
| 7–8 | Alertmanager, 94 alert rules, notification routing |
| 9–10 | The operations agent — remediation bridge, tier classification, approval gates |
| 11 | Backups — Restic, B2 offsite, daily verification |
| 12 | Putting it all together — from alert to resolution in 47 seconds |
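
To make the "tier classification" and "approval gates" of chapters 9–10 concrete, here is a minimal sketch in Python. It is not the book's implementation: the tier names, the alert names, and the `classify` and `decide` functions are all hypothetical, chosen only to illustrate routing an alert to automatic remediation, approval-gated remediation, or notification only.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """Hypothetical remediation tiers (names are assumptions, not from the book)."""
    AUTO = 1      # safe to fix without asking
    APPROVAL = 2  # fix only after a human approves
    MANUAL = 3    # never automated; notify only


@dataclass(frozen=True)
class Alert:
    name: str
    host: str


# Hypothetical mapping from alert name to tier.
# Unknown alerts fall back to MANUAL, the most conservative choice.
TIER_BY_ALERT = {
    "ServiceDown": Tier.AUTO,
    "DiskAlmostFull": Tier.APPROVAL,
}


def classify(alert: Alert) -> Tier:
    """Look up the tier for an alert, defaulting to MANUAL."""
    return TIER_BY_ALERT.get(alert.name, Tier.MANUAL)


def decide(alert: Alert, approved: bool = False) -> str:
    """Return the action for an alert, honoring the approval gate."""
    tier = classify(alert)
    if tier is Tier.AUTO:
        return "remediate"
    if tier is Tier.APPROVAL:
        return "remediate" if approved else "await_approval"
    return "notify_only"
```

In this sketch, `decide(Alert("DiskAlmostFull", "db01"))` returns `"await_approval"` until it is called again with `approved=True`. Defaulting unknown alerts to the manual tier keeps the agent from acting on anything it has not been explicitly told is safe.
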
Get the book
196 pages. $19 minimum, $29 suggested. Available now on Leanpub.
Free sample
Not sure yet? Get the free getting-started guide: a quick-start walkthrough that covers prerequisites, cloning the repo, and running your first automated remediation, using the same tools and patterns as the book.
You can also grab the Ansible Homelab Cheat Sheet — 20 commands and patterns on one page, no signup required.