Getting Started with OpsKern
You have a Linux host and Ansible. In about 30 minutes, you will have the foundation of a self-healing homelab: monitoring deployed, alert rules active, and your first remediation playbook run.
Prerequisites
You need the following on your control node (the machine that runs Ansible):
- Linux host — Ubuntu 22.04+, Debian 12+, or Fedora 38+ (any systemd-based distro works)
- Ansible — version 2.14 or newer
- Python 3.10+ — for the operations agent
- SSH access — key-based authentication to your managed hosts
- Git — to clone the repo
On your managed hosts (the servers Ansible will configure):
- Linux — same distro requirements as above
- SSH server — running and reachable from the control node
- sudo access — for the Ansible user
Optional but recommended:
- Tailscale — for secure, zero-config VPN between hosts
- A second host — to see fleet-wide monitoring in action (a VM or LXC container works fine)
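The control-node requirements above can be checked with a short script. This is a minimal sketch, assuming the 2.14 minimum refers to the version reported by `ansible --version` and that GNU `sort -V` is available. The helper names `version_ge`, `check_tool`, and `first_version` are illustrative and not part of ops-kernel-stack.

```shell
#!/usr/bin/env bash
# Sketch: check control-node prerequisites against the minimums listed above.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# first_version: read text on stdin, print the first dotted version number found.
first_version() {
  grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# check_tool NAME MIN FOUND: report whether the FOUND version satisfies MIN.
check_tool() {
  local name="$1" min="$2" found="$3"
  if [ -z "$found" ]; then
    echo "$name: not found (need $min+)"
    return 1
  fi
  if version_ge "$found" "$min"; then
    echo "$name: $found ok"
  else
    echo "$name: $found too old (need $min+)"
    return 1
  fi
}

missing=0
ansible_ver="$(ansible --version 2>/dev/null | first_version)"
python_ver="$(python3 --version 2>/dev/null | first_version)"

check_tool "ansible" 2.14 "$ansible_ver" || missing=1
check_tool "python3" 3.10 "$python_ver" || missing=1
if command -v git >/dev/null 2>&1; then
  echo "git: ok"
else
  echo "git: not found"
  missing=1
fi

if [ "$missing" -eq 0 ]; then
  echo "All control-node prerequisites met."
else
  echo "Some prerequisites are missing or too old."
fi
```

The script only reports; it does not install anything. SSH access to the managed hosts still needs to be verified separately.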
Step 1: Clone the repo
git clone https://github.com/opskern/ops-kernel-stack.git
cd ops-kernel-stack
Step 2: Configure your inventory
Copy the example inventory and add your hosts:
cp inventory/example.yml inventory/hosts.yml
Edit inventory/hosts.yml with your hostnames and IP addresses. At minimum, you need one host under the monitoring group.
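For orientation, a one-host inventory in Ansible's standard YAML format might look like the sketch below. The host name `mon01` and the address are placeholders, and the real layout of `inventory/example.yml` may differ, so treat the copied file as authoritative. The snippet writes to a scratch directory so that it cannot overwrite your real inventory.

```shell
# Sketch: write a hypothetical minimal inventory to a scratch file.
# 192.0.2.0/24 is a reserved documentation range; replace it with a real address.
demo_dir="$(mktemp -d)"
demo_inventory="$demo_dir/hosts.yml"

cat > "$demo_inventory" <<'EOF'
all:
  children:
    monitoring:
      hosts:
        mon01:
          ansible_host: 192.0.2.10
EOF

cat "$demo_inventory"

# To inspect your real inventory once it is edited (standard Ansible commands):
#   ansible-inventory -i inventory/hosts.yml --graph
#   ansible -i inventory/hosts.yml monitoring -m ansible.builtin.ping
```

The `--graph` output shows which hosts ended up in the monitoring group, and the ping module confirms that SSH and Python work on each of them.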
Step 3: Deploy monitoring
This single command deploys Prometheus, Grafana, Loki, and node_exporter across your fleet:
ansible-playbook playbooks/site.yml -l monitoring
After this completes, open Grafana at http://<monitoring-host>:3000. Default credentials are in the README.
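Beyond opening Grafana in a browser, each service can be probed from the command line. The sketch below assumes the upstream default ports and health endpoints (Prometheus on 9090, Loki on 3100, node_exporter on 9100); only Grafana's port 3000 is stated in this guide, and the playbook may configure the others differently. The function names `health_url` and `check_stack` are illustrative.

```shell
# health_url HOST SERVICE: print the health-check URL for a service,
# assuming upstream default ports (only Grafana's 3000 is confirmed above).
health_url() {
  local host="$1" service="$2"
  case "$service" in
    grafana)       echo "http://$host:3000/api/health" ;;
    prometheus)    echo "http://$host:9090/-/healthy" ;;
    loki)          echo "http://$host:3100/ready" ;;
    node_exporter) echo "http://$host:9100/metrics" ;;
    *) echo "unknown service: $service" >&2; return 1 ;;
  esac
}

# check_stack HOST: probe each service once and report whether it answered.
check_stack() {
  local host="$1" service url
  for service in grafana prometheus loki node_exporter; do
    url="$(health_url "$host" "$service")"
    if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
      echo "$service: up ($url)"
    else
      echo "$service: DOWN ($url)"
    fi
  done
}

# Usage: check_stack <monitoring-host>
```

A service reported as DOWN may simply be listening on a different port or bound to localhost, so check the role variables before assuming the deployment failed.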
Step 4: Deploy the alert rules
The collection ships with 108 alert rules. To deploy them:
ansible-playbook playbooks/alerting.yml
Alertmanager will start routing alerts based on the default configuration. Edit group_vars/all/alertmanager.yml to point notifications at your preferred channel (email, Slack, ntfy, etc.).
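To confirm that notifications reach your channel, you can post a synthetic alert to Alertmanager's v2 API. This is a sketch that assumes Alertmanager listens on its upstream default port 9093; the alert name `OpsKernRoutingTest` and the `severity` label are made up for the test, and whether they match a route depends on your configuration.

```shell
# test_alert_payload NAME SEVERITY: print a one-alert JSON array
# in the shape accepted by Alertmanager's POST /api/v2/alerts.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"manual-test"},"annotations":{"summary":"Manual routing test"}}]\n' "$1" "$2"
}

# send_test_alert HOST: post the synthetic alert (port 9093 is an assumption).
send_test_alert() {
  test_alert_payload "OpsKernRoutingTest" "warning" |
    curl -fsS --max-time 5 -X POST \
      -H 'Content-Type: application/json' \
      --data-binary @- "http://$1:9093/api/v2/alerts"
}

# Usage: send_test_alert <monitoring-host>
```

Because the payload has no end time, Alertmanager treats the alert as resolved after its configured resolve timeout, so the test cleans up after itself.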
Step 5: Run your first remediation
The simplest remediation to test is disk cleanup. Trigger it manually to see the pipeline in action:
ansible-playbook playbooks/remediate-disk-cleanup.yml -l <your-host>
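One way to see what the cleanup achieved is to record disk usage on the target host before and after the run. The helper below is a sketch; `disk_used_pct` is an illustrative name, and it assumes a POSIX `df` and a filesystem name without spaces.

```shell
# disk_used_pct PATH: print the used-space percentage (digits only)
# of the filesystem that holds PATH, using POSIX-format df output.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

echo "root filesystem: $(disk_used_pct /)% used"

# On the target host, compare before and after the playbook run:
#   before="$(disk_used_pct /)"
#   ... run the remediation playbook from the control node ...
#   after="$(disk_used_pct /)"
#   echo "freed $((before - after)) percentage points"
#
# To preview changes first, ansible-playbook accepts --check --diff.
# Whether this playbook's tasks support check mode is an assumption.
```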
To enable automated remediation (Alertmanager triggers playbooks via the operations agent), follow the operations agent setup in the repo README.
What’s next
- Ansible Homelab Cheat Sheet — 20 commands and patterns on one page
- The book — 12 chapters covering the full stack, from Proxmox provisioning to the operations agent
- ops-kernel-stack on GitHub — the full source, issues, and discussions
- Blog — deep dives on specific patterns and architecture decisions
Questions? Email hello@opskern.io or open an issue on GitHub.