The Slack message I built EZLogs to kill

Blog · May 30, 2026

At three jobs in a row, the same message kept landing in the engineering channel: “hey can someone check what this user did yesterday for the support thread I’m on?” The dance after that was always the same.

Open Kibana, or production logs, or whichever centralized log thing the company had. Grep for the customer’s email. Scroll past three hundred lines of SQL and Rack noise. Find the request. Trace it into the background jobs it spawned. Translate the result into a sentence a person could read. Paste the sentence into the support thread. Get pinged two hours later by someone else asking about a different user.

Once a week, fine. Five times a day, it’s a job nobody listed in the offer letter.

The thing that finally clicked for me wasn’t that the tooling was bad. It was that the role was wrong. Engineers were the translators because nothing else could read the logs. The production log file is a plain-text dump intended for the human who wrote the code. There’s no story in it, no chronology of one customer’s action across HTTP and jobs and DB writes, no plain-English summary. The dashboard tools that sit on top of logs (Datadog, New Relic, Honeycomb) tell you whether the system is healthy. They don’t tell you what one customer did.

I got tired of being the translator, so I wrote the thing that translates.

What I shipped

EZLogs is a Rails gem and a server. The install is two lines:

bundle add ez_logs_agent
rails generate ez_logs_agent:install

The generator drops one initializer that reads ENV["EZLOGS_API_KEY"]. No per-controller code. No include Auditable on your models. No manual correlation IDs to thread through your code.

The gem hooks into three places in Rails:

Rack middleware for HTTP requests. Captures the path, status, duration, current user if your app sets one, and the params (sensitive fields redacted).

ActiveJob and Sidekiq lifecycle hooks for background jobs. Captures the job class, queue, arguments, outcome, and timing for everything from perform_now to perform_later across processes.

ActiveRecord callbacks for database writes. Captures the model, primary key, operation, and before/after values for the columns that changed.
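To make the first of those three hooks concrete, here is a minimal sketch of request capture as plain Rack middleware. It is not the gem's actual code: the class name RequestCapture, the sink: argument, and the event hash shape are all assumptions made for illustration. It records only method, path, status, and duration, and leaves out the current user and params.

```ruby
# Hypothetical sketch, not the actual ez_logs_agent middleware.
class RequestCapture
  def initialize(app, sink:)
    @app = app
    @sink = sink # anything that responds to <<, e.g. an Array or a buffer
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    record(env, status, started)
    [status, headers, body]
  end

  private

  def record(env, status, started)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @sink << {
      type: :http,
      method: env["REQUEST_METHOD"],
      path: env["PATH_INFO"],
      status: status,
      duration_ms: (elapsed * 1000).round(2)
    }
  rescue StandardError
    # Capture must never break the request, so failures are swallowed.
    nil
  end
end

# Usage with a stand-in app (a lambda is a valid Rack app).
events = []
app = ->(_env) { [201, { "content-type" => "text/plain" }, ["created"]] }
stack = RequestCapture.new(app, sink: events)
status, _headers, _body = stack.call("REQUEST_METHOD" => "POST", "PATH_INFO" => "/orders")
```

The monotonic clock is used for the duration so that a wall-clock adjustment mid-request cannot produce a negative number.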

Events are buffered in memory and shipped out-of-band over a separate HTTP connection. The agent never raises into your request path. If the EZLogs server is unreachable, the agent buffers up to ten thousand events and retries with backoff. If the buffer overflows, the oldest events drop. I’d rather lose the oldest events than 500 a customer.
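The drop-oldest behavior is easy to sketch. The following is an illustration under stated assumptions, not the gem's implementation: EventBuffer, drain, requeue, and backoff_seconds are hypothetical names, and the batch size and backoff constants are made up. Only the ten-thousand-event capacity comes from the description above.

```ruby
# Hypothetical sketch of a bounded, drop-oldest event buffer.
class EventBuffer
  def initialize(capacity: 10_000)
    @capacity = capacity
    @events = []
    @lock = Mutex.new
  end

  # Adds an event; when the buffer is full, the oldest event is evicted.
  def push(event)
    @lock.synchronize do
      @events.shift if @events.size >= @capacity
      @events << event
    end
    nil
  end

  # Removes and returns up to `limit` events, oldest first.
  def drain(limit = 500)
    @lock.synchronize { @events.shift(limit) }
  end

  # Puts a failed batch back at the front, then trims the oldest
  # events if the buffer is over capacity again.
  def requeue(batch)
    @lock.synchronize do
      @events.unshift(*batch)
      @events.shift(@events.size - @capacity) if @events.size > @capacity
    end
    nil
  end

  def size
    @lock.synchronize { @events.size }
  end
end

# Exponential backoff with a cap: 1s, 2s, 4s, ... up to `cap` seconds.
def backoff_seconds(attempt, base: 1, cap: 60)
  [base * (2**attempt), cap].min
end
```

A background shipper thread would call drain, try to send the batch, and on failure call requeue and sleep for backoff_seconds(attempt). The request thread only ever touches push, which does no I/O.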

The server is where the magic happens, and the magic is boring on purpose. Every event from one user’s action shares a correlation ID. That ID travels from the HTTP request into the jobs the request enqueued, into the DB writes those jobs make, and out the other side. The server stitches the three streams together and produces one card per user action — not one per request.
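The stitching step boils down to a group-by on the correlation ID followed by a sort on time. Here is a rough sketch, assuming a simplified event shape (correlation_id, type, at, summary) that is invented for this example; the real server's schema and card format are not shown in this post.

```ruby
# Hypothetical event shape: { correlation_id:, type:, at:, summary: }.
# Groups events from all three streams into one card per correlation ID.
def build_cards(events)
  events.group_by { |e| e[:correlation_id] }.map do |correlation_id, group|
    ordered = group.sort_by { |e| e[:at] }
    {
      correlation_id: correlation_id,
      started_at: ordered.first[:at],
      finished_at: ordered.last[:at],
      events: ordered.map { |e| "#{e[:type]}: #{e[:summary]}" }
    }
  end
end

# Events arrive out of order and interleaved across users.
events = [
  { correlation_id: "abc", type: :job,  at: 2, summary: "OrderMailerJob succeeded" },
  { correlation_id: "abc", type: :http, at: 1, summary: "POST /orders 201" },
  { correlation_id: "xyz", type: :http, at: 5, summary: "GET /orders 200" },
  { correlation_id: "abc", type: :db,   at: 3, summary: "Order created" }
]
cards = build_cards(events)
```

Four events collapse into two cards here. The "abc" card holds the request, the database write, and the job in chronological order, which is the one-card-per-action shape described above.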

The card looks like this:

The title is a sentence: “Created order #4300. Triggered by Sarah.” The body has the entities the action touched, the outcome, when it started, and how long it took. The “What happened behind the scenes” section is collapsed by default and contains the underlying events: the HTTP POST to /orders, the database change that created the row, and every job and email that fired afterward.

That’s the part the support team opens to answer “what happened to this order?” That’s the part the product team reads when they want to understand a flow. That’s the part you can paste into a Notion doc when you’re explaining a bug to the CEO.

Sensitive fields are redacted by default. Anything matching password, token, secret, key, or any *_at timestamp column is dropped before the event leaves your process. You can extend the redaction list in the initializer. The reason redaction is on by default and not opt-in is that the audience for the card is the whole company. If the CS lead has to wonder whether a card is safe to send to the CEO, the gem has already failed.
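A default-on redaction pass could look something like the sketch below. The pattern list mirrors the defaults named above, but the constant SENSITIVE, the redact function, and the way the list is extended are assumptions for illustration. The gem's real initializer options are not shown here.

```ruby
# Hypothetical sketch; patterns mirror the defaults described in the text.
SENSITIVE = [/password/i, /token/i, /secret/i, /key/i, /_at\z/i].freeze

# Returns a copy of `value` with sensitive keys dropped at every depth.
def redact(value, patterns = SENSITIVE)
  case value
  when Hash
    value.each_with_object({}) do |(k, v), out|
      next if patterns.any? { |p| p.match?(k.to_s) }
      out[k] = redact(v, patterns)
    end
  when Array
    value.map { |v| redact(v, patterns) }
  else
    value
  end
end

params = {
  "email" => "sarah@example.com",
  "password" => "hunter2",
  "card" => { "api_key" => "k", "last4" => "4242" },
  "created_at" => "2026-05-30"
}
safe = redact(params)

# Extending the list, as an app might do from its initializer.
stricter = redact(params, SENSITIVE + [/email/i])
```

Matching on substrings is deliberately broad: a pattern like key also drops api_key and would drop an innocent column named monkey. For a card the whole company can read, over-redacting is the safer failure mode.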

It’s not metrics. It’s not APM. It doesn’t replace Datadog and doesn’t try to. If you want p99 latency graphs, this is the wrong tool. If you want logs your CS team can read without asking an engineer, that’s the whole pitch.

The first week

I launched EZLogs on Wednesday. r/rails first, then r/ruby, then a Twitter post from a brand-new account, then a Hacker News attempt that the algorithm downweighted because the account was twelve hours old. The Reddit posts landed in the boring way good launches do: a few thousand views, a handful of upvotes, a comment thread that asked the load-bearing question (“is this usable without an LLM?” — yes, the core pipeline is deterministic templates).

Two days in, I went back to...
