I Built a Watchman for My Servers


I Built a Watchman for My Servers (It Even Opens Its Own Fix PRs) | Akash Rajpurohit


The fear nobody warns you about

I run a few small products on my own. Writing the code was never the scary part. The scary part is everything that can quietly go wrong once it’s live and you’re not looking.

A background job silently piles up. A query that used to be fast isn’t anymore. An endpoint starts throwing errors on one weird edge case. None of it pages you. None of it shows up until a customer hits it, or you happen to glance at a dashboard and feel your stomach drop.

When you’re a team of one, there’s no on-call rotation. You are the rotation. And you can’t sit and stare at Grafana all day; you’ve got a product to build.

So I built something to do the staring for me. I’ve been calling it The Sentinel.

Akash Rajpurohit (@akashwhocodes):

A bug in my production app got fixed yesterday.

I didn't write the fix. I didn't even know about the bug.

Here's the full story 👇

I run few small products solo. The hardest part isn't building, it's keeping watch once things are live. Errors, failing jobs, APIs silently


What it actually is

Nothing exotic. A handful of Python scripts on a little box at home, a cron job firing them every few minutes, and the usual open-source tooling doing the boring work underneath: Prometheus for metrics, Loki for logs, that sort of thing. The scripts pull together what’s happening across the servers and apps that actually matter, and hand it all to one place that looks at it properly. That’s where I plug in Claude to do the thinking, and it’s the bit that makes this more than another dashboard.
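As a rough illustration of the "scripts pull together what's happening" step, here is a minimal sketch of one such script asking Prometheus a question over its HTTP API. The `/api/v1/query` endpoint and its response shape are standard Prometheus; the server address, the function names, and the way labels are flattened are assumptions made for the example, not the post's actual code.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable on its default local port.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Build a URL for Prometheus's instant-query endpoint."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def parse_instant_vector(payload: dict) -> dict[str, float]:
    """Flatten an instant-vector response into {"label=value,...": sample}."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    samples = {}
    for series in payload["data"]["result"]:
        labels = ",".join(f"{k}={v}" for k, v in sorted(series["metric"].items()))
        _timestamp, value = series["value"]  # Prometheus sends the value as a string
        samples[labels] = float(value)
    return samples


def query(promql: str, base_url: str = PROMETHEUS_URL) -> dict[str, float]:
    """Run one PromQL query; needs a reachable Prometheus server."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return parse_instant_vector(json.load(resp))
```

A cron-driven script could call something like `query("up == 0")` to list scrape targets that are currently down, then pass the result along with log excerpts to whatever does the analysis.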

Everything pushes to Telegram, because the bar for “tell me something is wrong” should be a notification on my phone, not a tab I have to remember to open.
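Pushing a message to Telegram from a script takes very little. The sketch below uses the Bot API's `sendMessage` method with only the standard library; the token, chat ID, and function names are placeholders for illustration, and the post does not say how its own scripts send messages.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_send_message(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Prepare a Bot API sendMessage call without sending it."""
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{TELEGRAM_API}/bot{token}/sendMessage",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(token: str, chat_id: str, text: str) -> bool:
    """Send the alert; returns True when Telegram reports success."""
    request = build_send_message(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return bool(json.load(resp).get("ok"))
```

Keeping request construction separate from the network call makes the message format easy to check without a real bot token.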

The flow: the sentinel watches the servers and apps and pings me only when something genuinely matters, I approve a closer look, it investigates the live system read-only, and for a real bug it opens a fix PR.

Rule one: shut up unless it matters

Most monitoring gets one thing wrong, and it’s the thing I cared about most. Noise.

A system that cries wolf is worse than no system at all, because you learn to ignore it, and then you ignore the one alert that actually mattered. I’ve muted enough chatty alerts in my life to know exactly how that movie ends.

So the sentinel runs on one strict rule: only talk to me when something that was supposed to happen didn’t. No “everything’s fine” pings. No panic because CPU touched 80% for four seconds. If a backup didn’t run, if a queue is genuinely stuck, if spend suddenly jumps, if a brand-new error nobody’s seen before just showed up, that’s worth my attention. A bot poking at random URLs is not.
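One way to express "something that was supposed to happen didn't" is as a list of expected events, each with a maximum age for its last success. The sketch below is an assumed shape for that check, not the sentinel's real implementation; the `Expectation` type, the event names, and the time limits are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Expectation:
    name: str
    max_age: timedelta  # how long without a success before it counts as missed


def missed(
    expectations: list[Expectation],
    last_success: dict[str, datetime],
    now: datetime,
) -> list[str]:
    """Return the names of expected events with no recent-enough success."""
    overdue = []
    for exp in expectations:
        seen = last_success.get(exp.name)
        # Never having succeeded counts as missed too.
        if seen is None or now - seen > exp.max_age:
            overdue.append(exp.name)
    return overdue


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    expected = [Expectation("nightly_backup", timedelta(hours=26))]
    print(missed(expected, {"nightly_backup": now - timedelta(hours=30)}, now))
```

Nothing is reported while every expectation is met, which matches the rule of staying silent when things are fine.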

💡 The whole philosophy in one line

If it can’t be acted on, it doesn’t get sent. A quiet system you trust beats a chatty one you tune out.

Getting this right took real tuning. The first version pinged me for things that fixed themselves a minute later. So now it waits, checks whether a problem actually sticks around, and only then bothers me. Most blips never reach my phone, which is exactly the point.
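The "wait and see if it sticks around" behaviour can be sketched as a streak counter: a check has to fail on several consecutive runs before it is allowed to send anything. The threshold of three runs and the function names below are assumptions for illustration; the post does not say how long the sentinel actually waits.

```python
def update_streaks(streaks: dict[str, int], failing: set[str]) -> dict[str, int]:
    """Count consecutive failing runs per check; recovered checks drop out."""
    return {name: streaks.get(name, 0) + 1 for name in failing}


def should_alert(streaks: dict[str, int], threshold: int = 3) -> list[str]:
    """Checks that have just reached the threshold, so each problem alerts once."""
    return sorted(name for name, count in streaks.items() if count == threshold)
```

Because cron starts a fresh process on every run, the `streaks` dictionary would need to be saved between runs, for example in a small JSON file. A blip that clears after one or two runs never reaches the threshold and so never produces a message.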

It doesn’t just alert. It investigates.

This is the part I’m a little too proud of.

An alert is a starting point, not an answer. “Latency spiked on this endpoint” tells you something’s wrong, not what or why. Normally that’s where your evening begins: sshing around, grepping logs, slowly piecing it together.

With the sentinel I just reply to the alert. Right there in the chat. “What actually happened here?”

And it goes and finds out. Read-only, it pokes around the live system, the logs, the database, the metrics, and comes back with a real answer: what broke, why, the evidence, and what to do about it. Not a guess. The actual thing, dug out of production.

The first time it handed me a root cause I’d have spent an hour chasing myself, I genuinely laughed.

And then it opens the fix

Okay, this next bit still feels a little like cheating.

When the digging turns up a real, durable bug, something in the code that’s going to keep biting until it’s fixed, the sentinel can go one step further. It writes up a proper issue and opens a pull request with the fix. I just read it and merge.

It’s gated, on purpose. Nothing happens without me. I get the alert, I approve the dig, it investigates, it decides whether the thing is genuinely worth fixing, and only then does it open the issue and the fix. I’m always the one who says ship it. But the grunt work, the finding, the writing-up, the first draft of the fix, all of that happens without me touching a thing.
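The gating described above amounts to a small state machine in which only certain events can move an incident forward. The sketch below is one assumed way to model it; the stage and event names are invented for the example, and the dismiss and reject paths are additions the post does not spell out.

```python
from enum import Enum, auto


class Stage(Enum):
    ALERTED = auto()
    INVESTIGATING = auto()
    FIX_PROPOSED = auto()
    MERGED = auto()
    CLOSED = auto()


# (current stage, event) -> next stage. Events prefixed "human_" need a person.
TRANSITIONS = {
    (Stage.ALERTED, "human_approves_dig"): Stage.INVESTIGATING,
    (Stage.ALERTED, "human_dismisses"): Stage.CLOSED,          # assumed path
    (Stage.INVESTIGATING, "worth_fixing"): Stage.FIX_PROPOSED,
    (Stage.INVESTIGATING, "not_worth_fixing"): Stage.CLOSED,
    (Stage.FIX_PROPOSED, "human_merges"): Stage.MERGED,
    (Stage.FIX_PROPOSED, "human_rejects"): Stage.CLOSED,       # assumed path
}


def advance(stage: Stage, event: str) -> Stage:
    """Move to the next stage, refusing any transition not listed above."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {stage.name}") from None
```

With this shape there is no route from an alert to a merged fix that skips either human step: the investigation needs approval to start, and the pull request needs a person to merge it.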

Something that spots the problem, works out whether it’s even real, and then hands me the fix. I’m still getting used to that.
