I Let an AI Agent Run My SEO Site — and Published Its Bugs | Tokenmaxxing<br>Experiment log<br>I let an AI agent run my SEO site. It broke things. I published the bugs.<br>June 28, 2026<br>an experiment log from tokenmaxxing.com
There are thousands of AI-generated SEO sites. They're mostly terrible — thin content scraped from somewhere else, keywords crammed in, zero editorial judgment. They exist to harvest clicks and they're poisoning search results.<br>I built another AI-operated site. Before you close the tab: the pitch isn't that mine is better because AI is good now, or because I used a fancier model, or because I care more. The pitch is that it runs transparently. The agent logs its own mistakes. The site publishes its own kill-switch criteria. And if it fails the bet, it changes direction on a fixed date.<br>This is not a content farm. It's closer to a public experiment with skin in the game.
What it is<br>tokenmaxxing.com (this site) is a content and data site about AI token costs, model usage, and the economics of running agents at scale. The premise: as AI becomes infrastructure, the cost side matters as much as the capability side, and almost nobody is publishing useful structured data about it.<br>It has a live leaderboard of which AI models developers actually pay to run (from OpenRouter's public usage data), a board estimating which companies spend the most on tokens, ~70 upcoming AI events, and a guide section. It's modest, and the autonomous version of it is about three weeks old.
How it actually runs itself<br>Scheduled jobs, not a human writing content and pretending otherwise:<br>A daily job pulls fresh data (search console, usage stats, analytics) and runs a light analysis pass. If something's worth surfacing (a ranking moved, a number changed materially), it drafts an update and queues it.<br>There's an editorial gate. The agent does not auto-publish. A human (me, about 30 minutes a week) reviews the queue, approves or kills each item, and handles anything needing real judgment, like actually sending the newsletter. The agent does the legwork; I do the sign-off.<br>When a piece clears review, it is committed to main, the Git host auto-deploys, and a post-deploy smoke check runs against the live URL. If the check returns unexpected output, the deploy rolls back automatically.<br>That's the loop. Not magic: a cron schedule, a few API integrations, a review queue, and a deploy pipeline with a safety valve.
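The last step of that loop, the post-deploy smoke check with automatic rollback, can be sketched roughly as follows. This is a minimal illustration under assumptions, not the site's actual pipeline. The function names (`http_fetch`, `smoke_check`, `post_deploy`), the pass criterion (a 200 status plus an expected marker string in the page), and the `rollback` callback are all hypothetical, since the post doesn't say what the check looks for or how the host performs the rollback.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Hypothetical type: a fetcher takes a URL and returns (status_code, body).
Fetch = Callable[[str], Tuple[int, str]]


def http_fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a URL with the standard library. Status 0 means no response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx / 5xx).
        return err.code, ""
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return 0, ""


def smoke_check(fetch: Fetch, url: str, expected_marker: str) -> bool:
    """Pass only if the page answers 200 and contains the expected text."""
    status, body = fetch(url)
    return status == 200 and expected_marker in body


def post_deploy(fetch: Fetch, url: str, expected_marker: str,
                rollback: Callable[[], None]) -> str:
    """Run the smoke check against the live URL; roll back if it fails."""
    if smoke_check(fetch, url, expected_marker):
        return "kept"
    rollback()  # assumption: the deploy host exposes some rollback action
    return "rolled_back"
```

Passing the fetcher in as an argument is a deliberate choice in this sketch: a test can supply a stub such as `lambda url: (500, "")` and confirm the rollback hook fires, while a real run would pass `http_fetch`.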
The actual numbers, unrounded<br>Three weeks in:<br>~1,300 unique visitors per month<br>2 email subscribers (yes, two)<br>“tokenmaxxing” ranks ~position 5.5 at ~3% click-through<br>The definition queries (“what is tokenmaxxing,” “tokenmaxxing meaning”) rank, but at ~0.5% click-through — the bet is they convert better as the content matures<br>These are small. I'm not going to call them “early traction.” They're small because the site is new and the niche is narrow. The point of publishing them is that you can watch them change — or not — over the next two months.
Three things it broke (and what I learned)<br>This is the part that matters for trust.<br>1. I killed a section on bad math, then rebuilt it. The events section was getting almost no clicks, so I cut it — made sense on a three-month rolling average. The problem: a three-month average of a section that existed for two weeks buries the trend under a mountain of zeros. Weekly data for those final two weeks showed impressions going from nothing to ~150/week. It was growing. I'd killed it using data structurally incapable of detecting growth that recent. Lesson: never judge a two-week-old thing on a three-month average. The window has to match the thing.<br>2. The signup form silently failed for six days. A change meant to simplify the newsletter opt-in sent a field the email provider's API rejects — a 422. Every signup failed and created nothing, for almost a week, with no error shown to the person submitting. I caught it only by finally running a real end-to-end test: submitting an actual email and checking whether a subscriber appeared. It didn't. Six days of silently broken signups. (See the subscriber count above.) Lesson: verify against the real API with a real submission. Don't reason from memory about what a third-party API accepts — the docs lie sometimes; the API doesn't.<br>3. Production briefly served a stale old build. The live site reverted to an outdated version. It looked like a code regression. It wasn't — something had deployed outside the normal Git pipeline and decoupled the domain from the latest build. The fix wasn't a code rollback; it was a fresh deploy, once I checked the domain-to-deployment mapping and saw it pointing at the wrong build. But I burned time staring at diffs first. Lesson: when prod looks reverted, check the deployment target before assuming the code is wrong.<br>All three are logged at the self-running desk — because a site claiming to run transparently that hides its failures is just a content farm with better...
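The arithmetic behind lesson 1 can be made concrete with a small sketch. The numbers are illustrative: the 150 in the final week echoes the ~150/week figure above, but the 40 for the week before, the 13-week length of a "three-month" window, and the function names `window_mean` and `matched_window_mean` are assumptions made for the example, not data from the site.

```python
from typing import Sequence


def window_mean(weekly: Sequence[float], window: int) -> float:
    """Mean of the most recent `window` weeks of data."""
    if window < 1:
        raise ValueError("window must be at least 1 week")
    recent = weekly[-window:]
    return sum(recent) / len(recent)


def matched_window_mean(weekly: Sequence[float], weeks_live: int,
                        max_window: int = 13) -> float:
    """Average only over weeks the section has existed, capped at max_window."""
    return window_mean(weekly, min(weeks_live, max_window))


# Eleven weeks before the section existed (zeros), then its two live weeks.
# The 40 is an assumed value; the 150 mirrors the ~150/week in the post.
weekly_impressions = [0] * 11 + [40, 150]

three_month = window_mean(weekly_impressions, 13)               # about 14.6
since_launch = matched_window_mean(weekly_impressions, 2)       # 95.0
week_over_week = weekly_impressions[-1] - weekly_impressions[-2]  # 110

print(f"13-week mean: {three_month:.1f}")
print(f"mean since launch: {since_launch:.1f}")
print(f"week-over-week change: {week_over_week:+d}")
```

Under these assumed numbers the 13-week mean is diluted by eleven weeks in which the section did not exist, while the launch-matched window and the week-over-week change both show the growth.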