Teaching coding agents to debug Rails memory issues with derailed_benchmarks

At Superconductor, we use Superconductor to build Superconductor, and one of the great things about this is that it makes it easy to spin up investigations in the background when you come across issues.

One issue we had been experiencing for a while was that our server memory usage would steadily increase over time until we either redeployed or the container ran out of memory and restarted the process. The issue was perplexing but not pressing, because we deploy frequently and each container restart pushed the problem off for a while.

However, while debugging a different issue, I noticed that the memory bloat was actually reproducible on any page, so I tasked some coding agents with running our application under derailed_benchmarks to profile the memory retention.

Figure: If your memory utilization graph looks like this, you may have a memory leak.

I’ve known about derailed_benchmarks for some time, but configuring the app to run locally in production mode was enough friction that I never got around to setting it up. I didn’t want to break my local development environment or affect my work on other tasks. This is a perfect task for background coding agents, because they can identify where your app breaks in a simulated production environment and iterate until it works. As evidenced by projects like autoresearch, agents are great at knocking down objectively measurable tasks autonomously, and with Superconductor, it’s easy to spin up multiple agents on the same task, compare results, and verify the changes.

I’d previously assumed that the memory bloat was due to some heavy authenticated pages like the implementation conversation view. So on my first attempt, I instructed the agents to reproduce the issue by populating the conversation view with synthetic data. The agents turned up a handful of different issues, but it was not immediately clear which of the issues were real and which were overblown.

Among the reported fixes were things like preloading N+1 queries (good to do, but not a memory issue), using jemalloc (we already do), and reducing performance instrumentation (plausible, but trades performance for observability, and does not explain retained memory). One other suggestion was to avoid allocating a heavy TailwindMerge::Merger object on CurrentAttributes, but CurrentAttributes is cleared between requests, so I initially dismissed it as the source of persistent memory bloat. Not every agent was successful in configuring derailed_benchmarks to run in production mode, so many of them also surfaced development-only issues that would not be relevant in production. Most agents were convinced that the issue was memory fragmentation, and not a memory leak. With multiple differing suggestions and no clear culprit, I didn’t follow up on that investigation right away.

Figure: The initial investigation produced several plausible explanations. Can you spot the real problem?

Later, while debugging a different issue, I realized that the memory bloat was occurring on every page, including static pages like the splash page. With the newly simplified reproduction steps, I launched a ticket to look at splash page rendering. This time, 4 out of 5 implementations pointed to the same issue with TailwindMerge, complete with detailed benchmark results. Since the splash page was a much simpler profiling target, we didn’t get any spurious suggestions. The benchmark results over many requests gave us confidence that they had identified a real memory retention issue and not something more nebulous.
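The idea of measuring retention over many requests can be illustrated without a Rails app. The following is a minimal sketch, not the benchmark the agents ran: `sample_retained_slots` is a hypothetical helper that runs a workload repeatedly, forces a full GC, and samples `GC.stat(:heap_live_slots)`. The two workloads are invented stand-ins for "render a page". derailed_benchmarks ships its own tasks for this kind of measurement against a real app.

```ruby
# Hypothetical harness: run a workload repeatedly and sample live heap slots
# after a full GC, so growth reflects retained objects rather than garbage.
def sample_retained_slots(iterations:, &workload)
  Array.new(iterations) do
    workload.call
    GC.start # full mark and sweep before sampling
    GC.stat(:heap_live_slots)
  end
end

# A deliberately leaky workload: every call pins new strings in a global array.
LEAK = []
leaky = -> { 5_000.times { |i| LEAK << "retained-#{i}" } }

# A clean workload: the same allocations, but nothing keeps them alive.
clean = -> { 5_000.times { |i| "garbage-#{i}".length } }

leaky_samples = sample_retained_slots(iterations: 20, &leaky)
clean_samples = sample_retained_slots(iterations: 20, &clean)

puts "leaky growth: #{leaky_samples.last - leaky_samples.first} slots"
puts "clean growth: #{clean_samples.last - clean_samples.first} slots"
```

A workload that retains memory shows growth that scales with the number of iterations, while a clean one flattens out after warm-up. That difference is what separates a leak from one-time allocation or fragmentation noise.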

Figure: The simplified splash page benchmark pointed to retained memory in TailwindMerge.

All the coding agents suggested similar mitigations in our application code: move the TailwindMerge::Merger off of CurrentAttributes and reuse it through thread-local storage instead. I made sure the change made sense and tested it in a staging environment to verify that it did indeed solve the memory bloat.
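As a rough illustration of that mitigation, here is a minimal sketch of reusing one expensive object per thread. It is not our actual application code: `StandInMerger` is a hypothetical placeholder for `TailwindMerge::Merger` so that the example runs without the gem, and the `ClassMerging` module and its key name are invented for the example.

```ruby
# Hypothetical stand-in for an expensive-to-build object such as
# TailwindMerge::Merger; the real gem is not loaded in this sketch.
class StandInMerger
  def merge(classes)
    classes.split.uniq.join(" ")
  end
end

module ClassMerging
  KEY = :class_merging_merger # hypothetical thread-variable name

  # Build the merger once per thread and reuse it on later calls,
  # instead of constructing a new one on every request.
  def self.merger
    existing = Thread.current.thread_variable_get(KEY)
    return existing if existing

    fresh = StandInMerger.new
    Thread.current.thread_variable_set(KEY, fresh)
    fresh
  end
end

puts ClassMerging.merger.merge("p-4 p-4 text-sm")
```

Because CurrentAttributes is reset between requests, anything memoized there is rebuilt on each request. A thread-local keeps the number of instances bounded by the number of server threads, which caps the damage even if each instance is retained somewhere.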

Figure: Before, memory usage climbed steadily across repeated requests. After deploying the fix, the same workload stabilized after a few requests.

After verifying the mitigation, I pointed some agents towards the gem in question, tailwind_merge. The agents were able to find the root cause: the configuration merging code was mutating a global defaults constant when instantiating objects and holding a reference to each instantiated object in that global configuration, so it was a memory leak after all. We iterated on a fix and added tests and a benchmark script to reproduce the issue. We submitted a detailed PR upstream so that other users of this gem don’t experience this issue.
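The general shape of this kind of bug is easy to reproduce in isolation. The sketch below is a hypothetical illustration of the pattern described above, not tailwind_merge's actual code: `GLOBAL_DEFAULTS`, `LeakyMerger`, and `FixedMerger` are invented names, and the real fix may look different.

```ruby
# Hypothetical reproduction of the bug pattern; not the gem's actual code.
GLOBAL_DEFAULTS = { separator: ":", instances: [] }

class LeakyMerger
  def initialize(overrides = {})
    # Bug 1: merge! updates the shared constant in place.
    @config = GLOBAL_DEFAULTS.merge!(overrides)
    # Bug 2: the shared array now references every instance ever built,
    # so none of them can be garbage collected.
    @config[:instances] << self
  end
end

class FixedMerger
  def initialize(overrides = {})
    # Non-destructive merge, plus a per-instance array: the global
    # constant is neither modified nor handed a reference to self.
    @config = GLOBAL_DEFAULTS.merge(overrides).merge(instances: [self])
  end
end

3.times { LeakyMerger.new }
puts "instances pinned by the global: #{GLOBAL_DEFAULTS[:instances].size}"
```

Since a constant is reachable for the life of the process, anything it references is retained too. That is why building one such object per request appears as steady growth, not as a one-time cost.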

Figure: Coding agents found the memory leak, and the guided review explained the fix in detail.

Now that we have a working setup with derailed_benchmarks, we turned that process into an Agent Skill in our repository so that future memory leaks are much easier to...
