AI Workflows in Production Without Burning Tokens

Bringing AI Workflows into Production without Burning Tokens - Unmeshed

Adopting AI (or the abilities of an LLM) in production is a core goal for most engineers today. In this article we look at how to bring AI into production while keeping token costs under control, so that the cost-versus-benefit equation lands on the benefit side and adds value to the business.

Authored by Gulam Mohiuddeen, Software Engineer

10 min read June 22, 2026

Let’s make it Agentic!

The push in the market is to use agentic flows. A flow is agentic when you let a model decide how to process a request and rely on its ability to parse and understand context to produce the best possible outcome for the use case. The idea is that as models mature and become more “intelligent”, the outcomes improve in quality and beat a fixed, human-coded algorithm.

With this in mind, you will often see a use case pushed into production that relies entirely on model calls.

For example, an agent may execute a use case by parsing input, validating data, classifying the request, checking policy, routing it to the right person, and drafting a response, all by calling a model.

It's often quite fast to build this with the many agentic frameworks out there today, and the demo is usually great and impressive to management.

Launching this in production for a high-volume use case brings a shock, though, when the bill arrives from the model providers. Token costs are increasing rather than decreasing as models evolve. Is that use case adding enough value to justify the cost of running it?
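To make the bill concrete, here is a back-of-the-envelope sketch of the monthly cost of a flow that calls a model at every step. Every number in it (request volume, tokens per call, price per million tokens) is a made-up placeholder, not a quote from any provider, and the `FlowCost` class is a hypothetical helper written for this illustration. Substitute your own figures.

```python
from dataclasses import dataclass


@dataclass
class FlowCost:
    """Hypothetical cost model; every field is an illustrative assumption."""
    requests_per_month: int
    model_calls_per_request: int
    input_tokens_per_call: int
    output_tokens_per_call: int
    usd_per_million_input: float
    usd_per_million_output: float

    def monthly_usd(self) -> float:
        # Total model calls, then input and output tokens priced separately.
        calls = self.requests_per_month * self.model_calls_per_request
        input_cost = calls * self.input_tokens_per_call * self.usd_per_million_input / 1_000_000
        output_cost = calls * self.output_tokens_per_call * self.usd_per_million_output / 1_000_000
        return input_cost + output_cost


# Placeholder numbers only. Six model calls per request mirrors the agent
# described earlier: parse, validate, classify, check policy, route, draft.
all_model = FlowCost(
    requests_per_month=100_000,
    model_calls_per_request=6,
    input_tokens_per_call=2_000,
    output_tokens_per_call=500,
    usd_per_million_input=3.0,
    usd_per_million_output=15.0,
)
print(f"${all_model.monthly_usd():,.2f} per month")  # $8,100.00 per month
```

The useful part is the shape of the formula rather than the total: cost scales linearly with the number of model calls per request, so every step moved out of the model reduces the bill proportionally.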

What about questions such as consistency, latency, security and governance? And an even bigger question: do you really know why a decision was made?

I think a trend shift is happening in the market now: teams are checking token spend against the value it creates. This shift is mostly among those who have already shipped a reasonable number of use cases leveraging AI. The camp that is still working to deploy its first use cases is not bothered by the spend because it has not hit them yet. But it is almost certain that once your budget starts to get eaten up, the question will come.

Going to Production with AI

The instinct to use AI for everything was not wrong. But does it always make sense? Teams and people are starting to ask which of the steps actually needs “intelligence” and which ones just need some rules or logic? This leads to answers for not just token spend, but the latency and consistency as well. Consistency means you know the reasons why the system is doing something.
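As a rough illustration of that split, the sketch below handles validation with plain rules and sends a request to the model only when a cheap keyword check cannot classify it. All names here (`amount_is_valid`, `date_is_valid`, `classify`, `CATEGORY_KEYWORDS`, `stub_model`) are hypothetical and written for this example, and the 90-day cutoff and keyword table are assumptions. It is one possible way to draw the line, not a prescribed design.

```python
from datetime import date
from typing import Callable

# Hypothetical keyword table; a real one would come from the business.
CATEGORY_KEYWORDS = {"taxi": "travel", "flight": "travel", "hotel": "lodging", "lunch": "meals"}


def amount_is_valid(amount: float, limit: float) -> bool:
    """Plain rule: no model is needed to compare two numbers."""
    return 0 < amount <= limit


def date_is_valid(request_date: date, today: date, max_age_days: int = 90) -> bool:
    """Plain rule: the date must not be in the future or older than the cutoff."""
    age_days = (today - request_date).days
    return 0 <= age_days <= max_age_days


def classify(description: str, call_model: Callable[[str], str]) -> str:
    """Try cheap keyword rules first; fall back to the model only for unclear text."""
    lowered = description.lower()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in lowered:
            return category
    return call_model(description)


model_calls: list[str] = []


def stub_model(text: str) -> str:
    """Stand-in for a real LLM call; records what it was asked."""
    model_calls.append(text)
    return "other"


print(amount_is_valid(42.0, limit=100.0))                        # True
print(date_is_valid(date(2026, 6, 1), today=date(2026, 6, 22)))  # True
print(classify("Taxi to the airport", stub_model))               # travel
print(classify("Gift for a retiring colleague", stub_model))     # other
print(len(model_calls))                                          # 1
```

The rule-based steps are free, fast, and explainable: when a request fails, you can point at the exact comparison that rejected it. Only the second classification reached the model.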

But if we are not using AI, are we not losing out? Isn’t that what everyone says now? Get on board or be left behind to be eaten by more modern competitors.

The solution is to maximize the use of AI, but in a way that yields maximum value rather than applying it blindly to everything. I think an example is overdue to explain this.

Expense approvals - These are quite common, every company needs them, and they are typically handled in one of two ways:

A human manually reviews and approves each expense based on some published policy

Rules in an HR system automatically approve some expenses and route others to approvers

Let’s say the finance team of a company wants to be more dynamic and respond rapidly to changing trends, with a system that maximizes employee productivity by letting employees manage expenses that are not bound by rules set in stone!

An engineering team asked to build this could simply ask the finance team to write the policy in a Google Doc or something similar, which is then published to the internal portal as the official policy. The team then builds an AI agent that reads this policy and decides on every request based on it. Now the finance team can update the policy every now and then and, without any developer in the loop, the change is reflected in each expense approval request. Et voilà! Cool, right?

Steps:

User initiates a chat with an expense agent

Exchange greetings (of course we humans always do that, agent or not!)

Upload a receipt, explain the expense

Agent parses the receipt, validates the amounts and dates

Evaluates the entire expense policy against the request

Decides on the request, informs the user

If approved, makes a request to the HR system to note the required expense reimbursement
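The steps above can be sketched as a flow in which every step is a model call. This is a minimal, offline illustration: `StubModel` is a hypothetical stand-in for a real LLM client that returns canned replies, `run_expense_agent` and `hr_queue` are invented names, the policy and receipt strings are placeholders, and the word count is only a crude proxy for tokens, not a real tokenizer.

```python
from dataclasses import dataclass, field


@dataclass
class StubModel:
    """Hypothetical stand-in for an LLM client; records each prompt it receives."""
    prompts: list[str] = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        # Canned replies so the sketch runs offline.
        return "APPROVED" if prompt.startswith("decide") else "OK"

    def approx_tokens(self) -> int:
        # Crude proxy: whitespace-separated words across all prompts.
        return sum(len(p.split()) for p in self.prompts)


def run_expense_agent(model: StubModel, policy: str, receipt_text: str,
                      explanation: str, hr_queue: list[str]) -> str:
    """One model call per step, mirroring the fully agentic flow."""
    model.complete(f"greet the user: {explanation}")
    model.complete(f"parse this receipt: {receipt_text}")
    model.complete(f"validate amounts and dates: {receipt_text}")
    # The entire policy text is sent along with the request.
    model.complete(f"evaluate policy: {policy} request: {explanation} receipt: {receipt_text}")
    decision = model.complete(f"decide and inform the user. policy: {policy} request: {explanation}")
    if decision == "APPROVED":
        # Stand-in for the HR system call that records the reimbursement.
        hr_queue.append(receipt_text)
    return decision


model = StubModel()
hr_queue: list[str] = []
decision = run_expense_agent(
    model,
    policy="Meals up to 50 USD per day are reimbursable.",  # placeholder policy
    receipt_text="Lunch 32.50 USD 2026-06-01",
    explanation="Team lunch with a customer",
    hr_queue=hr_queue,
)
print(decision, len(model.prompts), len(hr_queue))  # APPROVED 5 1
```

Even in this toy version, a single expense triggers five model calls, and the full policy text is resent in two of them. With a real policy document, those two prompts would dominate the token count returned by `approx_tokens()`.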

This is a pretty cool agentic flow if you were to build it, and I think the finance team and the entire company are probably going to be thrilled to use it. First, policies will actually be applied in practice (assuming the AI is intelligent), and the finance team has the flexibility to change them every Monday if they want to. Win-win. And the CTO can present to the board on how they leveraged the intelligence of the models available today to add value to the business.

The big savings here is the...
