The energy efficiency of agent networks


Benchmark White Paper · v1.0 · June 2026 · VDF-WP-2026-003

A controlled benchmark of how VDF AI reduces the energy footprint of enterprise AI by decomposing work into DAG-based agent networks and dispatching each step through SEEMR self-evolving model routing. The result: up to a 94.9% reduction in predicted energy, with output quality held non-inferior in aggregate.

Authors: VDF AI Research Team · Read time: 16 min · License: CC BY 4.0


ABSTRACT

Most of the energy an AI system consumes in production is spent at inference, with the same request answered again and again[6]. That energy is not a fixed property of a model. It is the outcome of a decision: which model runs, broken into how many steps, under what objective.

This paper reports a benchmark of that decision inside VDF AI. We compare a high-intensity baseline (one large model answering the whole task) against two compounding strategies: routing each request under an energy-aware objective, and decomposing a workload into a directed graph of smaller, independently routed stages. Across 71 configurations spanning four token budgets and five scenario families, energy-led routing reduced predicted energy by 81–95%, with a stable ~94.8% reduction for the frontier-versus-compact pairing.

Crucially, savings without quality are meaningless. In a separate execution benchmark with a quality score recorded per task, the routed condition reduced predicted energy by 94.9% while remaining non-inferior in aggregate under a margin fixed in advance — with the task-level exceptions disclosed in full. The contribution here is not a single number; it is an auditable account of how routing and decomposition turn energy into something an enterprise can measure, steer, and defend.

Keywords energy-aware inference · DAG agent networks · self-evolving routing · non-inferiority · Green AI · token-level attribution · sustainable enterprise AI

AT A GLANCE Six numbers from the benchmark

Peak energy avoided: 94.9% of predicted energy removed by eco routing vs. a pinned frontier baseline

Efficiency multiple: ≈20× less predicted energy per workload at the same task, frontier vs. routed

Quality outcome: non-inferior; routed quality held within a pre-registered 0.10 margin in aggregate

Benchmark depth: 71 configurations across five scenario families and four token budgets

Savings range: 81–95% reduction band observed across different model pairings

Selective frontier: 54% of energy still avoided when one DAG stage deliberately keeps the frontier model

FIGURE 1 The same work, a fraction of the energy

Aggregate of the quality-constrained execution benchmark: a pinned high-intensity baseline versus energy-aware routing, with the quality guardrail satisfied.

Pinned frontier baseline: 3.80 Wh

VDF AI, routed: 0.19 Wh

94.95% predicted energy avoided · ≈20× more efficient, same task · ±0.10 quality margin held

Fig. 1. Predicted energy in watt-hours for an identical task set. Figures are coefficient-based predictions under benchmark conditions, not measured wall power.
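The headline percentage and the efficiency multiple both follow from the two bar values in Fig. 1. The short Python check below is a sketch of that arithmetic only; the function names are illustrative and are not part of VDF AI. It uses the rounded figures as printed (3.80 Wh and 0.19 Wh), which yield 95.0% and 20×. The 94.95% shown in the figure presumably comes from unrounded inputs; that is an inference, not something the paper states.

```python
def energy_reduction(baseline_wh: float, routed_wh: float) -> float:
    """Fraction of the baseline's predicted energy avoided by the routed run."""
    if baseline_wh <= 0:
        raise ValueError("baseline energy must be positive")
    return 1.0 - routed_wh / baseline_wh


def efficiency_multiple(baseline_wh: float, routed_wh: float) -> float:
    """How many times less energy the routed run uses than the baseline."""
    if routed_wh <= 0:
        raise ValueError("routed energy must be positive")
    return baseline_wh / routed_wh


# Rounded values as printed in Fig. 1 (predicted, not measured wall power).
BASELINE_WH = 3.80
ROUTED_WH = 0.19

print(f"reduction: {energy_reduction(BASELINE_WH, ROUTED_WH):.1%}")
print(f"multiple:  {efficiency_multiple(BASELINE_WH, ROUTED_WH):.1f}x")
```

The two quantities are two views of the same ratio: a reduction of r corresponds to a multiple of 1 / (1 - r), so roughly 95% avoided and roughly 20× are equivalent statements.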

SECTION 1 Why inference energy is a decision, not a constant

A model is trained once and served billions of times. The integral of that serving tail now dominates the one-off training spike[6][8], which means the most leveraged place to reduce AI's footprint is the dispatcher that decides, per request, which model runs and how the work is split.

Enterprises increasingly have to attribute that energy — for sustainability reporting, for internal chargeback, and for procurement decisions that no longer accept a single annual number. So the question this paper answers is concrete: if you hold the task fixed and change only the routing and decomposition strategy, how much energy moves? And does quality survive the change?

We answer with a benchmark rather than an assertion. Two forms of evidence are reported: a coefficient-based comparison that isolates the effect of routing policy under fixed token assumptions, and a quality-constrained execution benchmark that pairs each energy figure with a measured quality score. The first tells us how big the lever is; the second tells us whether pulling it costs anything.
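To make the second form of evidence concrete, here is a minimal Python sketch of an aggregate non-inferiority check with task-level exceptions listed separately. Only the 0.10 margin comes from the paper. The task names and scores are hypothetical, and the test statistic (a plain mean of per-task differences) is an assumption, since this section does not say which statistic the benchmark uses. A stricter analysis would typically compare a confidence bound, not a point estimate, against the margin.

```python
from statistics import mean

MARGIN = 0.10  # pre-registered margin reported in the paper


def non_inferiority(baseline: dict, routed: dict, margin: float = MARGIN) -> dict:
    """Compare per-task quality scores of the routed run against the baseline.

    Assumed rule: the routed run is non-inferior in aggregate when its mean
    quality difference (routed - baseline) is no worse than -margin.
    """
    if not baseline or baseline.keys() != routed.keys():
        raise ValueError("both runs must score the same non-empty task set")
    deltas = {task: routed[task] - baseline[task] for task in baseline}
    mean_delta = mean(deltas.values())
    return {
        "mean_delta": mean_delta,
        "non_inferior": mean_delta >= -margin,
        # Tasks that individually fall outside the margin are disclosed, not hidden.
        "exceptions": sorted(t for t, d in deltas.items() if d < -margin),
    }


# Hypothetical scores for illustration only; not benchmark data.
baseline_scores = {"summarise": 0.90, "extract": 0.85, "classify": 0.95, "draft": 0.80}
routed_scores = {"summarise": 0.88, "extract": 0.86, "classify": 0.94, "draft": 0.65}

result = non_inferiority(baseline_scores, routed_scores)
print(result)
```

With these made-up scores the aggregate passes while one task falls outside the margin, which is the situation the abstract describes: non-inferior in aggregate, with task-level exceptions reported.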

SECTION 2 · FIGURE 2 The routing objective is a dial you control

The same candidate pool, three presets. Eco leans into energy; Max-Quality deliberately holds the heavy model. That Max-Quality lands at exactly 0% saving is the point — it proves the savings come from the policy, not from a benchmark quietly favouring the small model.

Frontier-class vs. compact local model

Eco (energy-led objective): 94.8%

Balanced (resolves to the same efficient pick here): 94.8%

Max-Quality (holds the frontier model by design): 0%

Heavy tier vs. light tier

Eco (narrower energy gap between candidates): 81.4%

Balanced (matches eco for this candidate set): 81.4%

Max-Quality (holds the heavy tier by design): 0%
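One way to picture the preset dial is as a weighted objective over a fixed candidate pool. The Python sketch below is an assumed formulation, not SEEMR's actual scoring function, which this section does not specify. The preset weights, the candidate names, and the energy and quality values are all hypothetical; the energy values are normalised and chosen only so that the frontier-versus-compact case reproduces the 94.8% figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    energy_wh: float  # predicted energy per request (hypothetical, normalised)
    quality: float    # expected quality in [0, 1] (hypothetical)


# Hypothetical preset weights: (energy weight, quality weight).
PRESETS = {
    "eco": (0.8, 0.2),
    "balanced": (0.5, 0.5),
    "max_quality": (0.0, 1.0),
}


def route(pool: list, preset: str) -> Candidate:
    """Pick the candidate with the lowest weighted cost under the preset."""
    w_energy, w_quality = PRESETS[preset]
    max_energy = max(c.energy_wh for c in pool)

    def cost(c: Candidate) -> float:
        # Relative energy and quality shortfall are both "lower is better".
        return w_energy * (c.energy_wh / max_energy) + w_quality * (1.0 - c.quality)

    return min(pool, key=cost)


def saving(pool: list, chosen: Candidate) -> float:
    """Energy avoided relative to pinning the most energy-hungry candidate."""
    baseline = max(pool, key=lambda c: c.energy_wh)
    return 1.0 - chosen.energy_wh / baseline.energy_wh


POOL = [Candidate("frontier", 1.000, 0.92), Candidate("compact", 0.052, 0.88)]

for preset in PRESETS:
    pick = route(POOL, preset)
    print(f"{preset:12s} -> {pick.name:8s} saving {saving(POOL, pick):.1%}")
```

Under these assumptions Eco and Balanced both select the compact model, while Max-Quality keeps the frontier model and therefore saves nothing. The pool is identical in all three cases, so the difference in saving comes from the weights alone.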

The...
