Show HN: Crashguard.io – canary-oriented workflow monitoring

felix-the-cat | 1 pts | 0 comments

Hi all,

I was trying to monitor some janky asynchronous workflows in a system and got fed up with trying to do it with OpenTelemetry and Splunk, so I built a little system for doing it differently.

CrashGuard is a canary-oriented system for monitoring distributed workflows. Instead of monitoring logs or watching for error events, you set a deadline (a "canary") when a workflow starts, and resolve it when the workflow actually finishes, regardless of how many services it passes through in between. If the canary doesn't resolve in time, CrashGuard knows something is late or stuck, even if every individual system along the way logged a success.

Key pieces:

- Canaries: a deadline tied to a workflow, created at the start and resolved at the end. One canary can span an arbitrary number of systems/services.

- Checkpoints: optional, named progress markers any system in the workflow can record along the way, without resetting the deadline. When a canary does expire, its checkpoint history shows exactly how far the workflow got ("failed after payment cleared, before fulfillment") instead of just "it failed."

- Verifiers: by default CrashGuard trusts the caller's resolution, but you can attach a verifier (your own HTTP endpoint) that CrashGuard calls on expiry before declaring failure. The verifier checks whatever actually matters (order totals, payment amounts, file landed correctly) and returns Resolve / Extend / Trigger depending on what you need the system to do.

- Stream Deck integration: my favorite feature. It comes with a plugin for Elgato Stream Decks that lets you configure buttons and dials to monitor individual canaries, metrics, and so on in near real-time. If a button flashes red, you can click on it to go directly to that canary's information in the admin app.

It's self-hosted, open source (LGPL3), and currently supports alerting to Slack, email, or webhooks.

I'd love to hear any feedback, suggestions, or advice.
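
For readers who want to see the canary lifecycle in concrete terms, here is a minimal in-memory sketch in Python. It is not CrashGuard's actual API or implementation: the names ToyMonitor, Canary, Verdict, start, checkpoint, resolve, and sweep are all hypothetical, the verifier is modeled as a plain callable rather than an HTTP endpoint, and time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class Verdict(Enum):
    # The three verifier outcomes named in the post.
    RESOLVE = "resolve"   # the workflow really finished; no alert
    EXTEND = "extend"     # give the workflow more time
    TRIGGER = "trigger"   # declare failure and alert


@dataclass
class Canary:
    name: str
    deadline: float                              # absolute time, in seconds
    checkpoints: List[str] = field(default_factory=list)
    state: str = "pending"                       # pending | resolved | triggered


class ToyMonitor:
    """Hypothetical stand-in for the monitoring service (illustration only)."""

    def __init__(self,
                 verifier: Optional[Callable[[Canary], Verdict]] = None,
                 extension: float = 30.0) -> None:
        self.canaries: Dict[str, Canary] = {}
        self.verifier = verifier
        self.extension = extension               # assumed extension length

    def start(self, name: str, now: float, timeout: float) -> Canary:
        # Called once, when the workflow begins.
        canary = Canary(name, deadline=now + timeout)
        self.canaries[name] = canary
        return canary

    def checkpoint(self, name: str, label: str) -> None:
        # Records progress; deliberately leaves the deadline untouched.
        self.canaries[name].checkpoints.append(label)

    def resolve(self, name: str) -> None:
        # Called by whichever service finishes the workflow.
        canary = self.canaries[name]
        if canary.state == "pending":
            canary.state = "resolved"

    def sweep(self, now: float) -> List[str]:
        # Checks every pending canary and returns alert messages for failures.
        alerts = []
        for canary in self.canaries.values():
            if canary.state != "pending" or now < canary.deadline:
                continue
            # With no verifier attached, an expired canary fails outright.
            verdict = self.verifier(canary) if self.verifier else Verdict.TRIGGER
            if verdict is Verdict.RESOLVE:
                canary.state = "resolved"
            elif verdict is Verdict.EXTEND:
                canary.deadline = now + self.extension
            else:
                canary.state = "triggered"
                last = canary.checkpoints[-1] if canary.checkpoints else "no checkpoint"
                alerts.append(f"{canary.name}: expired after '{last}'")
        return alerts


# A workflow that clears payment but never reaches fulfillment.
monitor = ToyMonitor()
monitor.start("order-1001", now=0.0, timeout=60.0)
monitor.checkpoint("order-1001", "payment cleared")
alerts = monitor.sweep(now=61.0)
print(alerts)
```

In this sketch nothing ever logs an error, yet the sweep still reports the stuck order along with the last checkpoint it reached. Passing a verifier that returns Verdict.EXTEND pushes the deadline out instead of alerting.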

