How I Audit a Legacy Rails Codebase in the First Week
After doing this enough times, I’ve learned the first week isn’t about reading the code. It’s about reading the signals.
The client already has opinions about what’s wrong. They’re usually partially right and almost always wrong about why. Your job in week one is to separate what looks bad from what’s actually dangerous.
TL;DR: Start with people, not code. The tools come after you know what you’re looking for.
The client bashfully frames everything as “technical debt”, but the codebase actually looks fairly healthy at face value. The test suite runs, even if it’s a bit slow. Deploys happen automatically and regularly, even if they sometimes need a babysitter. The real issue is that the engineer who gave notice last month wrote the entire checkout flow. They never documented it, and now the team is terrified to touch it. It’s good code, but it’s complex and underdocumented. The lack of knowledge sharing is the real problem, not the code itself.
Before You Clone the Repo: The Stakeholder Interview
The most diagnostic tool you have isn’t a gem.
You ask when the last time they deployed on a Friday was. They laugh. That laugh tells you more than any code metric could show you. It shows you that deploys are high stakes and that the team is living in fear of breaking production. On further investigation, you learn that there is no safe rollback procedure for deployment. If a deploy goes wrong, the team has to scramble to fix it in production. This is something static analysis won’t show you, but it’s also a critical signal of the codebase’s health.
Questions for developers:
“What’s the one area you’re afraid to touch?”
“When’s the last time you deployed on a Friday?”
“What broke in production in the last 90 days that wasn’t caught by tests?”
Questions for the CTO/EM:
“What feature has been blocked for over a year?”
“Do you have real-time error visibility right now?”
“What was the last feature that took significantly longer than estimated?”
Questions for business stakeholders:
“Are there features that got quietly turned off and never came back?”
“Are there things you’ve stopped promising customers?”
Reading the Gemfile, Schema, and Routes
You can form a working thesis in 30 minutes without running a single tool.
The transactions table had 122 columns. That number alone is a signal, but it starts to make a grim kind of sense when you see what’s in there: stripe_charge_id, wire_transfer_reference, ach_routing_number, paypal_transaction_id. Every payment processor the company had ever integrated with was there, each with its own set of nullable columns, all crammed into one table. A wire transfer doesn’t need a stripe_charge_id. A Stripe charge doesn’t need an ach_routing_number. Most rows are mostly null.
The separation of concerns problem is bad, but fixable. The integer primary key is not. The table had been around since the company’s founding. They processed a healthy volume of transactions every day. I didn’t need to run a query to know they were probably sitting somewhere north of a billion rows. The maximum for a signed 32-bit integer, which is what a plain integer primary key gives you, is about 2.1 billion (2,147,483,647). Nobody had thought about it because the app had always worked. I asked the CTO when they expected to hit that limit. He had no idea the limit existed.
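The arithmetic behind that question to the CTO can be sketched in a few lines of Ruby. This is a rough estimate, not a measurement: the method name, the current maximum id, and the daily insert rate below are all hypothetical placeholders. Note that the input should be the highest id in use (for example from SELECT MAX(id)), not the row count, because sequences skip values on rollbacks and deletes.

```ruby
# Ceiling of a signed 32-bit integer, the limit of a 4-byte integer primary key.
INT4_MAX = 2**31 - 1 # 2_147_483_647

# Estimate how many whole days remain before an auto-incrementing id
# reaches the ceiling.
#   current_max_id: highest id currently in the table
#   rows_per_day:   average daily inserts (an assumption you supply)
def days_until_exhaustion(current_max_id, rows_per_day)
  raise ArgumentError, "rows_per_day must be positive" unless rows_per_day.positive?

  remaining = INT4_MAX - current_max_id
  return 0 if remaining <= 0

  remaining / rows_per_day
end

# Hypothetical numbers, for illustration only.
days = days_until_exhaustion(1_200_000_000, 900_000)
puts "Roughly #{days} days (about #{(days / 365.0).round(1)} years) of headroom"
```

Even a crude number like this turns “the limit exists” into a date someone can plan a migration around.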
Gemfile: count the gems, look for duplicated responsibilities (two auth systems, two file upload gems), note anything you can’t explain.
db/schema.rb: god tables with 30+ columns, missing indexes on obvious foreign key columns, dead tables with no model counterpart, integer primary keys in an old high-volume app (a quiet ID exhaustion timebomb).
config/routes.rb: total count and ratio of RESTful resources to custom one-off routes. 500 custom routes isn’t a style problem. It’s an architecture one.
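If you want a number to back up the schema read, part of the db/schema.rb checklist above can be approximated with plain Ruby and no extra gems. The sketch below is a heuristic, not a parser: the method names are hypothetical, it assumes the usual schema.rb dump layout (one t.<type> "column" line per column), and it assumes a dump from Rails 5.1 or later, where a 4-byte key shows up as id: :serial or id: :integer on the create_table line.

```ruby
# Count column definitions per create_table block in schema.rb text.
# Returns a hash of { "table_name" => column_count }.
def column_counts(schema_text)
  counts = {}
  current = nil
  schema_text.each_line do |line|
    if (m = line.match(/^\s*create_table\s+"([^"]+)"/))
      current = m[1]
      counts[current] = 0
    elsif current && line.match?(/^\s*end\b/)
      current = nil
    elsif current && line.match?(/^\s*t\.(?!index\b|\w+_constraint\b)\w+\s+"/)
      # A column line such as: t.string "stripe_charge_id"
      counts[current] += 1
    end
  end
  counts
end

# Tables whose create_table line declares a 4-byte primary key.
def integer_pk_tables(schema_text)
  schema_text.scan(/^\s*create_table\s+"([^"]+)".*\bid:\s*:(?:serial|integer)\b/).flatten
end

# Tiny made-up schema; in practice use File.read("db/schema.rb").
sample = <<~SCHEMA
  create_table "transactions", id: :serial, force: :cascade do |t|
    t.string "stripe_charge_id"
    t.string "ach_routing_number"
    t.index ["stripe_charge_id"], name: "index_transactions_on_stripe_charge_id"
  end

  create_table "users", force: :cascade do |t|
    t.string "email"
  end
SCHEMA

wide = column_counts(sample).select { |_table, n| n >= 30 }
puts "Tables with 30+ columns: #{wide.keys.inspect}"
puts "Integer primary keys: #{integer_pk_tables(sample).inspect}"
```

Apps that dump db/structure.sql instead of schema.rb would need a different approach.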
Tools I Actually Run
SimpleCov reports 81% coverage. Looks healthy, right? Then you check which files have zero coverage: order.rb, payment.rb, subscription.rb. Three models that touch money have zero test coverage. The 81% was carried by hundreds of tests on utilities, views, mailers, and controllers.
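One way to get past the headline percentage is to read SimpleCov’s raw result file and list the files that were never executed. The sketch below assumes the default coverage/.resultset.json location and the layout where each file maps to a "lines" array of hit counts (older SimpleCov versions stored the array directly, which the code also tolerates). The method name and sample data are hypothetical. Files that were never loaded only appear in the result set if the project configures track_files, so an empty list is not proof of full coverage.

```ruby
require "json"

# Return the paths whose executable lines were never hit in any suite.
# resultset is the parsed contents of coverage/.resultset.json.
def zero_coverage_files(resultset)
  hits = Hash.new(0)
  resultset.each_value do |suite|
    suite.fetch("coverage", {}).each do |path, data|
      # Newer SimpleCov: { "lines" => [...] }; older: the array itself.
      lines = data.is_a?(Hash) ? data.fetch("lines", []) : data
      # nil entries are non-executable lines (comments, blanks); skip them.
      hits[path] += lines.grep(Integer).sum
    end
  end
  hits.select { |_path, total| total.zero? }.keys.sort
end

# Made-up data; in practice:
#   JSON.parse(File.read("coverage/.resultset.json"))
sample = JSON.parse(<<~DATA)
  {"RSpec": {"coverage": {
    "app/models/order.rb": {"lines": [null, 0, 0, null]},
    "app/helpers/format_helper.rb": {"lines": [1, 4, null]}
  }, "timestamp": 0}}
DATA

puts zero_coverage_files(sample)
```

Filtering the output to app/models is usually enough to see whether the money-handling code is among the untested files.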
Security: Run First, Non-Negotiable
bundle audit check --update
bundle exec brakeman --format html -o brakeman_report.html
Severity matters more than count. One critical CVE in an auth gem is a different problem than 20 low-severity advisories. With Brakeman, focus on confidence level and whether warnings are in high-traffic code paths.
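That triage can be roughed out in code if you also export Brakeman’s report as JSON (brakeman --format json -o brakeman_report.json). The sketch below sorts warnings by confidence first and then by whether the file sits under a path you consider high-traffic. The triage method and the hot_paths list are hypothetical: which paths count as hot is a judgment call that comes from the interviews, not from the tool. It relies only on the "warnings" array and the "confidence", "file", "line", and "warning_type" keys of each warning.

```ruby
require "json"

# Brakeman reports confidence as "High", "Medium", or "Weak".
CONFIDENCE_ORDER = { "High" => 0, "Medium" => 1, "Weak" => 2 }.freeze

# Sort warnings so high-confidence findings in hot paths come first.
# hot_paths is a list of path prefixes you supply.
def triage(report, hot_paths: [])
  report.fetch("warnings", []).sort_by do |warning|
    file = warning["file"].to_s
    hot = hot_paths.any? { |prefix| file.start_with?(prefix) } ? 0 : 1
    [CONFIDENCE_ORDER.fetch(warning["confidence"], 3), hot]
  end
end

# Made-up report; in practice:
#   JSON.parse(File.read("brakeman_report.json"))
sample = {
  "warnings" => [
    { "warning_type" => "Cross-Site Scripting", "confidence" => "Weak",
      "file" => "app/views/admin/notes/show.html.erb", "line" => 12 },
    { "warning_type" => "SQL Injection", "confidence" => "High",
      "file" => "app/controllers/checkout_controller.rb", "line" => 48 },
    { "warning_type" => "Mass Assignment", "confidence" => "High",
      "file" => "app/controllers/admin/reports_controller.rb", "line" => 7 }
  ]
}

triage(sample, hot_paths: ["app/controllers/checkout"]).each do |w|
  puts "#{w['confidence']}: #{w['warning_type']} (#{w['file']}:#{w['line']})"
end
```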
Dependency Health
bundle outdated
bundle exec bundle_report compatibility --rails-version=7.2 # via next_rails gem
Name the EOL date. Rails 6.1 went EOL October 2024. In regulated industries, running an EOL version isn’t...