A $0 bug silently cost $1,800 in API calls


Importance of per-feature cost tagging

Last quarter our OpenAI bill went from $620 to $2,480 in 23 days.

No new features shipped. No traffic spike. Zero error alerts. Deployment logs were clean.

Just a number climbing in silence while five engineers stared at dashboards that gave us totals and nothing else.

This is what we found. And why "cost monitoring" is completely the wrong mental model.

The dashboard that answers the wrong question

First thing I did was open the OpenAI usage dashboard.

It showed me a total. A graph going up. A model breakdown.

I knew we spent $2,480. I still had no idea which feature spent it, which service triggered it, or which user was responsible. The dashboard was answering "how much" while we were desperately asking "what caused it."

Those are completely different questions. Almost every cost tool on the market only answers the first one.

That distinction matters more than most engineering teams realise until they are staring at a bill like ours.

Three features, zero visibility

We had three features hitting GPT-4o:

A document summariser, triggered manually by users

An inline suggestion engine, triggered on keystrokes

A batch report generator, triggered on export

Any one of them could be the problem. Or all three. Or one specific user hammering one endpoint in a loop nobody noticed.

Without attribution at the feature, service, and user level, we were just guessing. So I did what most engineers do: optimised the feature that felt most expensive. Added caching to the one that ran most often.

Two weeks later the bill was still climbing.

Guessing at cost problems without attribution data is exactly like debugging a performance issue without a profiler. You move things around and hope.

48 hours of real data

A teammate dropped CostReveal in our Slack. I set it up that evening.

The Node.js SDK wraps your existing provider calls. You instrument each one with a feature name, service context, and user ID. That is the entire integration for the base case:

    import OpenAI from 'openai';
    import { CostReveal } from '@costreveal/node';

    // The OpenAI client reads OPENAI_API_KEY from the environment by default
    const openai = new OpenAI();
    const cr = new CostReveal({ apiKey: process.env.COSTREVEAL_API_KEY });

    // `prompt` and `req` come from the surrounding request handler
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: prompt,
    });

    // one call after your existing OpenAI call
    await cr.track({
      provider: 'openai',
      model: 'gpt-4o',
      feature: 'batch-report-generator',
      service: 'report-service',
      userId: req.user.id,
      inputTokens: response.usage.prompt_tokens,
      outputTokens: response.usage.completion_tokens,
    });
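One way to keep the tags from being forgotten is to route every provider call through a single helper that refuses to run without them. The sketch below is an illustration under assumptions, not part of the CostReveal SDK: `trackedCompletion` is a hypothetical name, and it relies only on the `track` call shown above plus the `usage` fields of the OpenAI response. The client and tracker are passed in as arguments so the helper can be exercised with stand-ins.

```javascript
// Hypothetical helper (not part of any SDK): makes the attribution tags
// mandatory so a provider call cannot ship untagged.
async function trackedCompletion({ openai, tracker, tags, request }) {
  for (const key of ['feature', 'service', 'userId']) {
    if (!tags || !tags[key]) {
      throw new Error(`missing attribution tag: ${key}`);
    }
  }

  const response = await openai.chat.completions.create(request);

  // Same fields as the track() call above; token counts come from the response.
  await tracker.track({
    provider: 'openai',
    model: request.model,
    ...tags,
    inputTokens: response.usage.prompt_tokens,
    outputTokens: response.usage.completion_tokens,
  });

  return response;
}
```

A call site would then pass `tags: { feature: 'batch-report-generator', service: 'report-service', userId: req.user.id }` alongside the usual request object.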


48 hours of data. Dashboard showed this:

    Feature                    Cost     Share
    batch-report-generator     $1,847   74%
    document-summariser        $421     17%
    inline-suggestion-engine   $212     9%
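For readers who want to see what a breakdown like this involves, here is a minimal sketch of rolling tagged usage records up into per-feature totals and shares. It is not how CostReveal computes its dashboard; the `costShareByFeature` name and the record shape (`feature`, `costUsd`) are assumptions for illustration, and the sample input is just the three totals from the table above.

```javascript
// Roll tagged usage records up into per-feature totals and percentage shares.
// Each record is assumed to carry a feature tag and a cost already in USD.
function costShareByFeature(records) {
  const totals = new Map();
  let grandTotal = 0;
  for (const { feature, costUsd } of records) {
    totals.set(feature, (totals.get(feature) ?? 0) + costUsd);
    grandTotal += costUsd;
  }
  return [...totals.entries()]
    .map(([feature, costUsd]) => ({
      feature,
      costUsd,
      sharePct: grandTotal === 0 ? 0 : Math.round((costUsd / grandTotal) * 100),
    }))
    .sort((a, b) => b.costUsd - a.costUsd); // most expensive feature first
}

// The three totals from the table above, fed in as single records.
const breakdown = costShareByFeature([
  { feature: 'document-summariser', costUsd: 421 },
  { feature: 'batch-report-generator', costUsd: 1847 },
  { feature: 'inline-suggestion-engine', costUsd: 212 },
]);
console.log(breakdown.map((r) => `${r.feature} ${r.sharePct}%`));
// [ 'batch-report-generator 74%', 'document-summariser 17%', 'inline-suggestion-engine 9%' ]
```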


I had been optimising the wrong two features for two straight weeks.

The actual bug: zero error rate, $1,847 cost

Back into the batch report generator code.

Found it in under ten minutes.

The export trigger was also wired into our autosave hook. Every time a document autosaved, which happens every 30 seconds by default, it was silently generating a full GPT-4o batch report in the background.

The feature worked perfectly. No errors. No timeouts. No failed requests. Nothing in our alerting. Nothing in our logs that looked wrong.

It was just calling GPT-4o every 30 seconds per active user session. Silently. Invisibly. Expensively.
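To get a feel for how quickly a 30-second trigger adds up, here is a back-of-the-envelope sketch. Only the 30-second interval comes from our setup; the session count and hours below are made-up inputs, and `autosaveCallsPerDay` is a hypothetical helper name.

```javascript
// Estimate how many report generations an autosave-triggered call produces.
// Only the 30-second interval is from the post; other inputs are illustrative.
function autosaveCallsPerDay({ autosaveIntervalSec, activeSessions, activeHoursPerSession }) {
  const callsPerHour = 3600 / autosaveIntervalSec; // 120 at a 30-second interval
  return callsPerHour * activeHoursPerSession * activeSessions;
}

const callsPerDay = autosaveCallsPerDay({
  autosaveIntervalSec: 30,
  activeSessions: 50,        // hypothetical
  activeHoursPerSession: 2,  // hypothetical
});
console.log(callsPerDay); // 12000
```

Each of those calls is a full report generation, which is why a trigger that looks harmless in code review can dominate the bill.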

This is what a $0 bug looks like. Zero error rate. $1,847 a month.

One-line fix: move the report generation call back to the manual export button only. OpenAI bill dropped 61% the following month. No model downgrade. No feature removal. No rate limiting that would have degraded the product.
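Our actual code is not shown here, but the shape of the fix can be sketched roughly as follows. Every name in it (`createDocumentHandlers`, `onAutosave`, `onManualExport`, `saveDocument`, `generateBatchReport`) is hypothetical; the point is only that the autosave path and the export path no longer share a trigger.

```javascript
// Hypothetical wiring: names are illustrative, not the post's actual code.
function createDocumentHandlers({ saveDocument, generateBatchReport }) {
  return {
    // Autosave only persists the document; it never reaches the LLM.
    async onAutosave(doc) {
      await saveDocument(doc);
    },
    // Report generation is reachable only from the explicit export action.
    async onManualExport(doc) {
      return generateBatchReport(doc);
    },
  };
}
```

A unit test that fires `onAutosave` and asserts the report function was never called is a cheap guard against the same wiring creeping back in.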

The thing I did not expect: it changed how we price

Once attribution was running across features, services, and users, I stopped thinking about cost as an infrastructure problem.

I started thinking about it as a pricing problem.

CostReveal breaks cost down by feature, by service, and by user. For the first time I could see what each user's activity was actually costing us to serve:

    Plan tier    Avg cost to serve/month   Our price
    Starter      $3.20                     $49 ✓
    Growth       $31.00                    $49 ✗
    Enterprise   $89.00                    $49 ✗✗


Our Growth and Enterprise users were hitting the batch report feature heavily. At flat $49 pricing, Enterprise cost us more in API calls than it paid, and Growth's API cost alone ate about 63% of its price before any other costs.
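The arithmetic behind that table is simple enough to sketch. The figures are the ones from the table; the `planMargins` helper and its field names are assumptions for illustration, and it counts only API cost to serve, not hosting, support, or anything else.

```javascript
// Monthly margin per plan, counting only API cost to serve (no other costs).
function planMargins(plans) {
  return plans.map(({ tier, avgCostUsd, priceUsd }) => ({
    tier,
    // Round to cents to avoid floating-point noise in the subtraction.
    marginUsd: Math.round((priceUsd - avgCostUsd) * 100) / 100,
    costPctOfPrice: Math.round((avgCostUsd / priceUsd) * 100),
  }));
}

const margins = planMargins([
  { tier: 'Starter', avgCostUsd: 3.2, priceUsd: 49 },
  { tier: 'Growth', avgCostUsd: 31, priceUsd: 49 },
  { tier: 'Enterprise', avgCostUsd: 89, priceUsd: 49 },
]);
console.log(margins);
// Starter:    marginUsd 45.8, costPctOfPrice 7
// Growth:     marginUsd 18,   costPctOfPrice 63
// Enterprise: marginUsd -40,  costPctOfPrice 182
```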

We repriced. Growth moved to $99. Enterprise moved to usage-based custom. We had the per-user, per-feature attribution data to build the case internally and explain the change to customers without a single guess involved.

That is not a cost monitoring outcome.

That is a pricing strategy...
