A month of vibe-coding at 0.01x velocity

Exchanging my data for service use, I took up OpenAI's offer of a free one-month trial of ChatGPT Plus. After slowly vibecoding an IDE plugin throughout last month, I'm eager to share my notes and lingering thoughts.

Marius Ghița

19th of June, 2026

This is not the first time I've tried a fully vibecoded project. Last year I returned multiple times to different providers to form my own opinion, given how divided public opinion is. I have never walked away with the feeling that coding is solved.

What follows is a rundown of my experience over the course of the month, followed by my takeaways.

The project, agent operation, and disclosure of preconceptions

I selected a project that was unreasonable for a limited timeframe, given my skills: a plugin for the IntelliJ IDE platform, written in Kotlin. A new domain, and a language I don't use. It is a small, greenfield project with no entrenched domain knowledge required of the model or of me.

As it is customary across the industry to build coding harnesses, the plugin is yet another LLM integration plugin in the sea of LLM plugins. In a few words, it is a plugin that works well with locally hosted models, uses them as ad-hoc fill-in-the-middle models, and never proactively interrupts the flow of programming. See the Alternatively prompt GitHub repository for more information.

I believe that coding agents are best used as tools to generate prototypes, for attempting problems that are outside the operator's area of expertise. I hold a similar belief about the practical applications of general-purpose LLMs for business integrations.

The setup, and the agent operation

Ask ten people how to best operate an agent and you'll get eleven answers. Maybe you'll also receive suggestions to run agents that run agents which check agents that control agents that actually code. If AI vibe-coding ultimately is the future, I'd like to prompt the way I code: lazily.

I've kept things simple though not optimal.

And if there is an optimal setup, companies that are a bit too well funded (pre-IPO) should be the ones enlightening us on the one true way™.

Best in class model.

Unless limited by the tool (planning mode), I always ran GPT-5.5 xhigh (extra high reasoning).

Planning for larger features.

The project had three main features and several functional rewrites. Most, if not all, of these went through the general plan-then-execute flow. In general, improvements that consisted of only a single short-sentence prompt (change color, add accessibility label, fix prompt for e2e test, etc.) skipped the plan phase.

No agent-specific project adjustments beforehand.

No AGENTS.md or equivalent markdown files. Whatever was in the upstream base template repository was used as is, without review. Later in the process, when the agent forgot how to run tests, it was instructed to generate an AGENTS.md file.

The weekly limits, and the effective daily limits

On the Plus plan, as expected, the quotas become obvious quickly. Their terms of service nowadays state those quotas are based on token usage, but when full allocation and utilization are not visible throughout the interface, how do you gauge how long you can work in a session?

Progress wasn't steady every day. Some days I would be fixing an issue or two, and because of a more extensive e2e run I would run out of requests pretty quickly. I assume this was in part due to the internal looping of codex to check whether processes have finished. If it sent the entire context with each request every time it checked whether a long-running e2e test was done, I can only imagine how much useless computation it could chew through.
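To put rough numbers on that guess, here is a minimal sketch of the arithmetic. It assumes my hypothesis is right and every status check resends the whole context, with each check's result appended to it. The function name and all the figures (context size, tokens per check, number of checks) are made up for illustration, not measurements from codex, and the sketch ignores any caching or discounting the provider may apply.

```python
def polling_input_tokens(context_tokens: int, tokens_per_poll: int, polls: int) -> int:
    """Total input tokens sent if every poll resends the whole, growing context.

    Hypothetical model: not based on how codex actually meters usage.
    """
    total = 0
    context = context_tokens
    for _ in range(polls):
        total += context            # the full context goes out with this request
        context += tokens_per_poll  # the check's result is appended afterwards
    return total


if __name__ == "__main__":
    # Illustrative numbers only: a 50k-token session polled 40 times,
    # e.g. a 20-minute e2e run checked every 30 seconds.
    total = polling_input_tokens(context_tokens=50_000, tokens_per_poll=200, polls=40)
    print(f"40 polls resend roughly {total:,} input tokens")
```

Under these assumptions the cost of waiting scales with context size times number of checks, so a long test run late in a large session would be the worst case.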

Surprising model behaviour when close to the usage limit.

I quickly learned that I should avoid any meaningful request when closing in on the 5 hour limit. While in the past I've seen model behaviour in which a response would be cut off abruptly (my experience last year when trialing JetBrains' Junie agent), with codex I felt shortchanged. Instead of producing a smaller part of the work due to the constraints, the model didn't stray from the prompt but took undesirable shortcuts. Instead of adjusting an end-to-end test as requested, it hardcoded asserts so the tests would pass. I did not observe this behaviour outside of this scenario.

The weekly limits most of the time would reset sooner than what was advertised in the tool.

I could draft up my own theory on why the dates and times were out of sync, but you won't hear me complain loudly, as this allowed me to work on this project more often. It is a good reminder that you can have billions of dollars in funding and still make the most basic mistakes, even with PhD-level intelligence at your disposal.

Model degradation during peak hours?

It is a known fact that AI labs are capacity constrained, though how that affects day-to-day usage remains mostly a guess. In practice, what I've noticed was that around the time US East...
