Vibecoding a high performance system
Andrew Chan
Discussion on r/programming.
There have been a thousand posts about vibe coding. Some case studies from Indragie Karunaratne, Edward Yang, and Cloudflare stuck out to me. Each of these is a project in a domain where the creator is already an expert, or a read-only app where bugs are low-impact, or a well-known standard with a small design space.
I recently used agentic coding to build a system to crawl a billion web pages in around 24 hours. Here's what's different:
The core concept is simple, but at scale, the design space is large.
There are parts where bugs could be really bad, like politeness.
The goal was to achieve a new level of an objective metric (throughput).
I'm respecting Michael Nielsen's precedent and holding off on releasing the whole repository. Instead, I'm open-sourcing a subset of the chatlogs, which contain large chunks of code that the agents wrote. Yes, I know this hurts the epistemics of my argument. You'll just have to trust me that the system works and was built as advertised. In this post, 💬 links to some chatlogs.
Spoiler: it was a huge boost overall. I'll argue that the learnings apply when building similar high-performance systems, with some caveats.
1. My setup
I used Cursor, just the IDE in agent mode, mostly with Claude. No background agents (which were in early preview when I started). Somehow this has become boring: Steve Yegge calls it “chat coding” and says it'll be ancient history by Q3. I stuck with it because:
I'm the type of programmer who doesn't mess with my tools too often and prefers to go deep on the problems I'm solving.
The tab autocomplete is still best-in-class. I found that I needed to stay in touch with the code for many reasons, and tab + other UI features kept me in the loop while still providing a big boost.
I've tried Claude Code a bit. It's good! I used it to make pbf-bench, and my friend Sawyer is building a whole toolchain on top of it. So far I've felt that it's best for small projects where I don't need to review the code, and for some terminal tasks. I expect I'll be using it more and more, but with Cursor, not instead of Cursor. Maybe that could change in a year or so.
How much did I code with agents vs. handwrite? According to my commit messages, out of 32k lines of code changes in this project...
% claude -p "give me a command or script to count the number of lines changed in .py files over all commits"
Here's a command to count lines changed in .py files across all commits:
```bash
git log --numstat --pretty="" -- "*.py" | awk '{add+=$1; del+=$2} END {print "Added lines: " add "\nDeleted lines: " del "\nTotal changes: " add+del}'
```
This command:
- Uses `git log --numstat` to get line change statistics
- Filters for only `.py` files with `-- "*.py"`
- Uses `awk` to sum up additions and deletions across all commits
% git log --numstat --pretty="" -- "crawler/**/*.py" "crawler/**/*.yml" "crawler/**/*.sh" "crawler/**/*.json" | awk '{add+=$1; del+=$2} END {print "Added lines: " add "\nDeleted lines: " del "\nTotal changes: " add+del}'
Added lines: 20305
Deleted lines: 11864
Total changes: 32169
...1.2k were handwritten:
% claude -p "give me a command or script to count the number of lines changed in .py files over all commits (filtered to include only commits with '[not vibed]' in the message)"
Here's a command to count the number of lines changed in .py files for commits with '[not vibed]' in the message:
```bash
git log --grep='\[not vibed\]' --pretty=format:'%H' | while read commit; do
git diff --numstat "$commit"^ "$commit" -- '*.py' 2>/dev/null | awk '{added+=$1; deleted+=$2} END {print added+deleted}'
done | awk '{total+=$1} END {print "Total lines changed: " total}'
```
% git log --grep='\[not vibed\]' --pretty=format:'%H' | while read commit; do
git diff --numstat "$commit"^ "$commit" -- 'crawler/**/*.py' 'crawler/**/*.yml' 'crawler/**/*.sh' 'crawler/**/*.json' 2>/dev/null | awk '{added+=$1; deleted+=$2} END {print added+deleted}'
done | awk '{total+=$1} END {print "Total lines changed: " total}'
Total lines changed: 1253
That's about 3.9% (1,253 of 32,169 lines)!
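If you want to redo this arithmetic on your own repo, here's a minimal Python sketch of the same idea. It assumes it is run from the repository root and that handwritten commits carry a marker such as `[not vibed]` in their message. The function names (`sum_numstat`, `lines_changed`, `handwritten_share`) are made up for illustration and are not from the project. It uses `git log --numstat` for both counts, so the result may differ slightly from the per-commit `git diff` loop above, for example on merge commits.

```python
import subprocess


def sum_numstat(numstat_output: str) -> int:
    """Sum added + deleted lines from `git log --numstat --pretty=""` output."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # blank lines between commits
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        total += int(added) + int(deleted)
    return total


def lines_changed(pathspecs, grep=None):
    """Run git and count changed lines, optionally only for commits whose
    message contains the literal string `grep` (hypothetical helper)."""
    cmd = ["git", "log", "--numstat", "--pretty="]
    if grep is not None:
        # --fixed-strings makes git treat the pattern literally, so the
        # square brackets in "[not vibed]" need no escaping.
        cmd += ["--fixed-strings", f"--grep={grep}"]
    cmd += ["--", *pathspecs]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return sum_numstat(result.stdout)


def handwritten_share(handwritten: int, total: int) -> float:
    """Percentage of changed lines that were handwritten."""
    return 100.0 * handwritten / total if total else 0.0


# With the exact counts from the command output above:
print(f"{handwritten_share(1253, 32169):.1f}%")  # prints 3.9%
```

In a real repo you would call something like `lines_changed(["crawler/**/*.py"])` and `lines_changed(["crawler/**/*.py"], grep="[not vibed]")` and pass both results to `handwritten_share`.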
I started building the crawler as a side project at the very end of May (first with Gemini 2.5 Pro, then mostly Claude 4 Opus), working for a couple of hours each night. Working through 8 completely different designs and running countless experiments took me through the July 4 week, which I took off work to go full-time on this. In retrospect, there are definitely ways I could've sped this up by parallelizing with background agents or worktrees. But for reasons I'll describe, my problem was ultimately still bottlenecked by code review and experiments, so I estimate I could've shaved a week of part-time work off.
2. The problem
The shape of the problem you're trying to solve will determine exactly how much AI helps you, because it will imply:
Where the bottlenecks are...