Large Language Models Are Overkill. Enter the Small Language Model

Large Language Models Are Overkill For Some Marketing Tasks. Enter The Small Language Model | AdExchanger

image/svg+xml:

Home AI Large Language Models Are Overkill For Some Marketing Tasks. Enter The Small Language Model

Thursday, June 25th, 2026 – 1:05 am<br>SHARE:

It’s no secret that large language models (LLMs) have gotten exorbitantly expensive.

Companies are starting to limit their employees’ AI usage to save money; OpenAI has even discussed lowering the cost of tokens to retain financially anxious customers.

But you know what’s cheaper than a large language model? A small language model.

AI company ZeroGPU develops small language models (SLMs) that are trained on a smaller amount of data and designed to perform specialized tasks and use cases.

On Thursday, the company announced a group of specialized SLMs for ad tech, with the goal of helping tech companies handle high-volume workflows more quickly and at a lower cost.

Work smaller, not harder

LLMs have “trillions and trillions of parameters,” and they’re trained on the entirety of the internet, said Maddy Arvapally, founder and CEO of ZeroGPU.

But for a lot of repetitive ad tech tasks, like content classification or document summaries, a much smaller model with fewer than 10 billion parameters is enough to get the job done, she said. Plus, using an SLM is cheaper and faster than having an LLM perform the same task.

Because of the sheer amount of data processing they require, LLMs rely on high-powered graphics processing units (GPUs) to feed them data and constantly revise the model’s parameters. But enterprise-grade GPUs are expensive, so many tech companies rent access to these GPUs from cloud infrastructure providers. Either way, the GPU costs get passed down to the end user.

ZeroGPU’s smaller models, however, run on central processing units, which are cheaper and handle tasks one at a time. They can also run on browsers.

Because SLMs carry lower processing costs, AI monetization company Dappier has seen a 50% decrease in its overall expenses since adopting ZeroGPU’s models, according to Co-Founder and CEO Dan Goikhman.

Dappier provides on- and off-site AI agents for marketers (basically, brand-specific chatbots) that are trained on their brand tone and guidelines. It also licenses publisher data for training AI tools.

So far, Dappier has adopted three of ZeroGPU’s SLMs: one for content classification, one for intent classification and one for moderation (or brand safety).

Keeping up with the times

The marketing-specific agents Dappier creates need to be “super responsive” to customer queries and follow-ups, said Goikhman, including the ability to classify each conversation and extract the user’s “commercial intent.” The intent could be anything from seeking out a particular type of product or wondering how this brand stands out from its competitors.

Say, for instance, a user is reading an article on a parenting website about helping their child with their homework. Dappier can create a chatbot that can generate prompt suggestions to open a conversation with that user about parenting.

But the SLMs continue to analyze the context of the conversation and the user’s intent as the interaction evolves, said Goikhman. And this enables the agent to constantly map both the content the user is engaging with and the conversation they’re having with the chatbot to IAB contextual categories.

The goal is to show publishers and advertisers what the article and the resulting conversation are about, he said. That way, they can understand what conversational prompts to show to keep the user engaged and what sorts of advertisers might be interested in that page and the ensuing chatbot interactions.

The final frontier?

Changing the models that your AI tools run on sounds like a heck of an undertaking. But ZeroGPU prioritizes easing the transition to SLMs for its clients.

Its models have “OpenAI-compatible endpoints,” said Arvapally, meaning that the only thing a client needs to do is swap out a URL on the backend so it calls ZeroGPU’s API rather than OpenAI’s.

According to Goikhman, the entire process took about five minutes.

Historically, Dappier was using the major frontier models like OpenAI and Claude to generate prompts and understand context. But the SLM’s results are still tailored to the customer’s needs, and the AI doesn’t lean on as many additional resources in its training.

For instance, Dappier tracks all of the conversations that consumers have within its chatbots to determine best practices for prompt suggestions. But its SLMs are trained just on these conversations and practices.

Another of Dappier’s main use cases is classifying articles and conversations within IAB categories. The IAB content taxonomy includes more than 1,500 categories, and the SLM is trained on all...

Large Language Models Are Overkill. Enter the Small Language Model

Related Articles

US Government directive to suspend access to Fable 5 and Mythos 5

Is AI ruining our skills? Early results are in – and they're not good

The Anatomy of an AI-Native Org

Apertus – Open Foundation Model for Sovereign AI

Britain Became as Poor as Mississippi