The $0 Bug That Cost Us $1,800 in API Calls - DEV Community
Add reaction
Like
Unicorn
Exploding Head
Raised Hands
Fire
Jump to Comments
Save
Boost
More...
Copy link<br>Copy link
Copied to Clipboard
Share to X
Share to LinkedIn
Share to Facebook
Share to Mastodon
Share Post via...
Report Abuse
Importance of per-feature cost tagging
Last quarter our OpenAI bill went from $620 to $2,480 in 23 days.
No new features shipped. No traffic spike. Zero error alerts. Deployment logs were clean.
Just a number climbing in silence while five engineers stared at dashboards that gave us totals and nothing else.
This is what we found. And why "cost monitoring" is completely the wrong mental model.
The dashboard that answers the wrong question
First thing I did was open the OpenAI usage dashboard.
It showed me a total. A graph going up. A model breakdown.
I knew we spent $2,480. I still had no idea which feature spent it, which service triggered it, or which user was responsible. The dashboard was answering "how much" while we were desperately asking "what caused it."
Those are completely different questions. Almost every cost tool on the market only answers the first one.
That distinction matters more than most engineering teams realise until they are staring at a bill like ours.
Three features, zero visibility
We had three features hitting GPT-4o:
A document summariser, triggered manually by users
An inline suggestion engine, triggered on keystrokes
A batch report generator, triggered on export
Any one of them could be the problem. Or all three. Or one specific user hammering one endpoint in a loop nobody noticed.
Without attribution at the feature, service, and user level, we were just guessing. So I did what most engineers do: optimised the feature that felt most expensive. Added caching to the one that ran most often.
Two weeks later the bill was still climbing.
Guessing at cost problems without attribution data is exactly like debugging a performance issue without a profiler. You move things around and hope.
48 hours of real data
A teammate dropped CostReveal in our Slack. I set it up that evening.
The Node.js SDK wraps your existing provider calls. You instrument each one with a feature name, service context, and user ID. That is the entire integration for the base case:
import { CostReveal } from '@costreveal/node';
const cr = new CostReveal({ apiKey: process.env.COSTREVEAL_API_KEY });
const response = await openai.chat.completions.create({<br>model: 'gpt-4o',<br>messages: prompt,<br>});
// one call after your existing OpenAI call<br>await cr.track({<br>provider: 'openai',<br>model: 'gpt-4o',<br>feature: 'batch-report-generator',<br>service: 'report-service',<br>userId: req.user.id,<br>inputTokens: response.usage.prompt_tokens,<br>outputTokens: response.usage.completion_tokens,<br>});
Enter fullscreen mode
Exit fullscreen mode
48 hours of data. Dashboard showed this:
Feature Cost Share
batch-report-generator $1,847 74%<br>document-summariser $421 17%<br>inline-suggestion-engine $212 9%
Enter fullscreen mode
Exit fullscreen mode
I had been optimising the wrong two features for two straight weeks.
The actual bug: zero error rate, $1,847 cost
Back into the batch report generator code.
Found it in under ten minutes.
The export trigger was also wired into our autosave hook. Every time a document autosaved, which happens every 30 seconds by default, it was silently generating a full GPT-4o batch report in the background.
The feature worked perfectly. No errors. No timeouts. No failed requests. Nothing in our alerting. Nothing in our logs that looked wrong.
It was just calling GPT-4o every 30 seconds per active user session. Silently. Invisibly. Expensively.
This is what a $0 bug looks like. Zero error rate. $1,847 a month.
One line fix: move the report generation call back to the manual export button only. OpenAI bill dropped 61% the following month. No model downgrade. No feature removal. No rate limiting that would have degraded the product.
The thing I did not expect: it changed how we price
Once attribution was running across features, services, and users, I stopped thinking about cost as an infrastructure problem.
I started thinking about it as a pricing problem.
CostReveal breaks cost down by feature, by service, and by user. For the first time I could see what each user's activity was actually costing us to serve:
Plan tier Avg cost to serve/month Our price
Starter $3.20 $49 ✓<br>Growth $31.00 $49 ✗<br>Enterprise $89.00 $49 ✗✗
Enter fullscreen mode
Exit fullscreen mode
Our Growth and Enterprise users were hitting the batch report feature heavily. We were losing money on both plan tiers at flat $49 pricing.
We repriced. Growth moved to $99. Enterprise moved to usage-based custom. We had the per-user, per-feature attribution data to build the case internally and explain the change to customers without a single guess involved.
That is not a cost monitoring outcome.
That is a pricing strategy...