Lessons from my overly-introspective, self-improving coding agent


Lessons from my overly-introspective, self-improving coding agent | ngrok blog


A year or two ago, everyone was building coding agents. Now everyone's building coding agents that modify themselves... and I wanted to join the fun and ask:

What happens when you tell a coding agent to think about what it's done and do better next time?

So, I built bmo: a self-improving coding agent, and then used it (almost) exclusively as my coding agent for two weeks. It's been wildly nifty to me (like, take me back to tearing apart the family computer's partition to install Debian from a CD that came in the back of some book my friend bought at Borders Books kind of novel and nifty) and is exposing a joy of computing that I haven't felt in quite a while.

Here's what I found.

A preamble on bmo's bootstraps

I wanted to design an agent harness on the principle of immediate action.

That starts with a basic agentic loop and access to three tools: run_command, load_skill, and reload_tools. I'd built other coding agents in the past and gave them access to more specific tools like write_file and list_cwd, but I've found that coding agents really only need access to shell commands to work as expected. I also wanted to give bmo a challenge: Instead of using run_command "fresh" with every session, I wanted to see how it could optimize its own "harnesses" for safe and efficient use of common Linux tools.
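To make the "shell commands are enough" idea concrete, here is a minimal sketch of a tool registry with a single run_command tool. Only the tool name comes from the post; the registry, the `dispatch` helper, the timeout parameter, and the shape of the returned dictionary are assumptions for illustration, not bmo's actual code.

```python
import subprocess
from typing import Callable

# Hypothetical registry mapping tool names to callables.
TOOLS: dict[str, Callable[..., dict]] = {}


def tool(fn: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function under its own name so the agent loop can call it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def run_command(command: str, timeout: float = 30.0) -> dict:
    """Run a shell command and return its exit code and captured output."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "exit_code": None, "stdout": "",
                "stderr": f"timed out after {timeout}s"}
    return {"ok": proc.returncode == 0, "exit_code": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}


def dispatch(name: str, **kwargs) -> dict:
    """Route a model-issued tool call to the registered function."""
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    return TOOLS[name](**kwargs)
```

In this sketch, load_skill and reload_tools would register the same way, and anything like write_file or list_cwd reduces to a `dispatch("run_command", command=...)` call.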

Self-improvement happens across four loops. The first is a "build it now" directive that interrupts the task to build a tool immediately, add it to a hot-reloadable library, and use it right away. The second is active learning capture, logging corrections and preferences. The third is self-reflection at session end. The fourth is the battery change every 10 sessions, where bmo says, "hey. i need to change my batteries, ok? one sec...", analyzes those 10 sessions, identifies opportunities, and builds improvements from the backlog.
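The fourth loop's cadence can be sketched as a counter that hands back a batch of session logs every tenth session. The 10-session interval is from the post; the `SessionTracker` class, its fields, and the idea of returning the batch from `end_session` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

BATTERY_CHANGE_EVERY = 10  # from the post: maintenance pass every 10 sessions


@dataclass
class SessionTracker:
    """Hypothetical bookkeeping for when a battery change is due."""
    completed: int = 0
    pending_logs: list = field(default_factory=list)

    def end_session(self, log: str) -> Optional[list]:
        """Record a finished session.

        Returns the logs collected since the last battery change when one
        is due, otherwise None.
        """
        self.completed += 1
        self.pending_logs.append(log)
        if self.completed % BATTERY_CHANGE_EVERY == 0:
            batch, self.pending_logs = self.pending_logs, []
            return batch
        return None
```

A caller would treat a non-None return value as the cue to analyze those sessions and work through the backlog before taking the next request.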

                         ┌──────────────────┐
                         │   User request   │
                         └────────┬─────────┘
┌─────────────────────────────────────────────────────────────────────┐
│                           ACTIVE SESSION                            │
│                                                                     │
│  ┌─────────────┐    friction?    ┌──────────────────┐               │
│  │   Execute   │───────yes──────▶│ 1. BUILD IT NOW  │               │
│  │  the task   │                 │    Build tool    │               │
│  │             │◀────continue────│    Hot-reload    │               │
│  └──────┬──────┘                 │    Validate      │               │
│         │                        └──────────────────┘               │
│         │  correction?                                              │
│         │  preference?           ┌──────────────────┐               │
│         └────────yes────────────▶│ 2. ACTIVE        │               │
│                                  │    LEARNING      │──▶ session log│
│                                  └──────────────────┘               │
└────────┬────────────────────────────────────────────────────────────┘
         │ session ends
┌──────────────────────┐                    ┌───────────────────────┐
│ 3. SELF-REFLECTION   │                    │ 4. BATTERY CHANGE     │
│    What went well?   │ every 10 sessions  │    Analyze sessions   │
│    What was slow?    │───────────────────▶│    Update WORKING_    │
│    Next time?        │                    │    MEMORY.md          │
└────────┬─────────────┘                    │    Build from         │
         │                                  │    OPPORTUNITIES.md   │
         │ session log                      └───────────┬───────────┘
         │                                              │
         ▼ tools, skills                                │

I had wanted to start with only the build it now loop, but everything else became necessary after many long conversations with bmo and some hard-won lessons. On that note:

What bmo learned

In our time together, bmo went through 8 maintenance passes and nearly 100 active sessions across multiple systems, which resulted in 11 new tools and 7 skills. I used bmo and its tools for everything: building parts of the new ngrok.com website, writing shell scripts for my dotfiles, scaffolding a new Astro site, debugging AMD graphics driver crashes, the whole kit and caboodle. It really has been my daily driver.

Knowing something isn't the same as doing it

Early on, bmo and I worked on a learning-event-capture skill designed to recognize when I express corrections and personal preferences, or when bmo itself notices a pattern worth saving. A truncated version is below, but you can see the whole skill in bmo's repo.

    # Learning Event Capture

    ## When to Use
    Continuously during every session. Learning events are corrections, preferences,
    or patterns that should inform future behavior.

    ## Recognition Cues

    ### Corrections (type: "correction")
    - User says "no", "not that", "wrong", "actually..."
    - User repeats an instruction you missed
    - User undoes something you did
    - User expresses frustration or disappointment
    - User provides the correct answer after your attempt

    ### Preferences (type: "preference")
    - User specifies a style choice ("use TypeScript", "keep it concise")
    - User chooses between options you offered
    - User describes their workflow or habits
    - User says "I always...", "I prefer...", "I like..."

    ### Patterns (type: "pattern")
    - User does the same type of task repeatedly
    - User follows a consistent workflow shape
    - You notice a recurring problem type or domain

    ## Best Practices

    1. **Log immediately when you detect a cue**
       - Call `log_learning_event` right away, don't wait for session end
       - Include specific context (what task, what...
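The skill names a `log_learning_event` tool but the excerpt does not show its signature, so the following is only a guess at what such a tool might look like: it validates the three event types from the skill and appends one JSON line per event so nothing waits for session end. The parameters (`log_path`, `summary`, `context`) and the JSON Lines format are assumptions, not bmo's actual implementation.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# The three event types named in the skill excerpt.
EVENT_TYPES = {"correction", "preference", "pattern"}


def log_learning_event(log_path: Path, event_type: str,
                       summary: str, context: str) -> dict:
    """Append one learning event to a JSON Lines file and return it.

    Hypothetical signature: the post only gives the tool's name.
    """
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown learning event type: {event_type!r}")
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "type": event_type,
        "summary": summary,
        "context": context,
    }
    # Append mode writes the event immediately, mid-session.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
    return event
```

An append-only log like this keeps each event cheap to write during a task and leaves the analysis for the later reflection and battery-change passes.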

