Learning Multi-Agent Coordination via Sheaf-ADMM

We introduce Sheaf-ADMM, an approach to building neural networks around the notion of multi-agent consensus. The framework sits at the intersection of sheaf theory and the alternating direction method of multipliers (ADMM) for distributed consensus.

Authors

Jeffrey Seely*

Sakana AI

Bartłomiej Cupiał*

Sakana AI, U. Warsaw, AKCES NCBR

Llion Jones

Sakana AI

Published

July 2026

Limited-view agents negotiating a global answer.

Introduction

AI systems are increasingly composed of many interacting agents rather than a single monolithic model. In current practice, multi-agent systems are typically centralized: for example, an orchestrator delegates and assigns subtasks. In many systems of interest, however, no such central coordinator exists. The nodes of a sensor network, the ants of a colony, or the neurons of a nervous system each observe a small part of the environment and communicate only with their neighbors, and coherent global behavior arises from these local interactions alone.

We wish to study the mechanisms of collective coordination. Our approach is to focus on the problem of distributed consensus — how multiple agents with individual views of data agree on a global state — and to look for inspiration in existing fields that have studied distributed consensus from different angles.

In distributed optimization, the alternating direction method of multipliers (ADMM) splits a global problem into per-agent subproblems; each agent solves its subproblem, reconciles its solution with neighbors, and repeats until the system reaches global consensus. The algorithm is rooted in the theory of convex optimization, but each step admits an elegant interpretation in terms of multi-agent coordination.
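As a rough illustration of that solve, reconcile, repeat loop, here is a minimal sketch of global-consensus ADMM in scaled form. It assumes toy scalar objectives $f_i(x) = \tfrac{1}{2}(x - a_i)^2$, whose consensus solution is the mean of the $a_i$. The function name `consensus_admm`, the penalty `rho`, and the targets are illustrative assumptions, not taken from the paper, and for brevity the reconcile step averages over all agents rather than over neighbors only.

```python
import numpy as np

def consensus_admm(a, rho=1.0, num_iters=100):
    """Scaled-form consensus ADMM for f_i(x) = 0.5 * (x - a_i)^2 (toy objective)."""
    a = np.asarray(a, dtype=float)
    x = a.copy()            # each agent's local solution
    z = 0.0                 # shared consensus value
    u = np.zeros_like(a)    # scaled dual variable, one per agent
    for _ in range(num_iters):
        # Local step: closed-form minimizer of f_i(x) + (rho / 2) * (x - z + u_i)^2.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Reconcile step: average the agents' proposals.
        z = float(np.mean(x + u))
        # Dual step: accumulate each agent's remaining disagreement with z.
        u = u + x - z
    return x, z

targets = [1.0, 4.0, 7.0]   # hypothetical per-agent data
x, z = consensus_admm(targets)
print(x, z)                 # all agents approach 4.0, the mean of the targets
```

The dual variable `u` is what turns repeated averaging into negotiation: it records how far each agent has been pulled from its own optimum, so the agreed value ends up balancing all of the local objectives.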

A complementary question is what neighboring agents must agree on. Full consensus is often too restrictive; an alternative is to ask agents to agree on only linear projections of their state. Incidentally, this coincides precisely with an object from applied algebraic topology: a network sheaf, which offers topological tools for studying distributed systems, such as harmonic states, topological obstructions to coordination, and more.

ADMM and sheaves offer complementary framings for local-to-global coordination. We develop Sheaf-ADMM, which uses both in a learnable system. ADMM supplies the coordination and negotiation dynamics, and the sheaf structure supplies the notion of inter-agent agreement. In Sheaf-ADMM, coordination evolves by running an ADMM solver across the latent space of hundreds of communicating agents. No agent sees enough of the input to solve the task on its own. The global solution emerges only from the agents' local negotiation.

We establish the method in simple settings: image classification, multi-agent Sudoku, and maze pathfinding. By focusing on simple tasks, we are able to isolate these coordination mechanisms clearly, and make them amenable to investigation.

Paper: arxiv.org/abs/2605.31005

Code: github.com/SakanaAI/sheaf-admm

Sheaves

We first introduce the sheaf component of the framework. Typical message-passing neural networks (MPNNs) use arbitrary learnable nonlinear maps (e.g. MLPs) to pass messages between agents (i.e. between nodes of a communication graph). Alternatively, a network sheaf gives a simple, linear, and, importantly, highly interpretable implementation of message passing.

The sheaf consensus condition. Two neighboring agents each hold a private state in $\mathbb{R}^2$. Restriction maps $F_{ij}, F_{ji}$ project both into a shared 1-D public communication channel. Consensus is reached when the projections agree, $F_{ij}x_i = F_{ji}x_j$.

In a sheaf, each agent $i$ holds a private state vector $x_i\in\mathbb{R}^d$ (a decision, or a latent representation). Two neighboring agents try to reach consensus — not by agreeing on their entire state vector, but only on a learned linear projection, $F_{ij} x_i = F_{ji} x_j$. Gradient descent on $\|F_{ij} x_i - F_{ji} x_j\|^2$ yields a message passing update:

$$x_i \leftarrow x_i - \eta\, F_{ij}^\top (F_{ij} x_i - F_{ji} x_j)$$

for each agent. Importantly, this update requires only knowledge of local state and pairwise prediction error ($F_{ij}x_i-F_{ji}x_j$) of adjacent agents. When cast across hundreds of agents with local connectivity, this amounts to a decentralized algorithm — sheaf diffusion — that iteratively updates each agent's private state vector to reach global consensus (defined as pairwise $F_{ij}x_i=F_{ji}x_j$ for all communicating agents).
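The update above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `sheaf_diffusion_step` and `disagreement`, the dictionary layout for the restriction maps, and the example values are all hypothetical. The example mirrors the two-agent picture, with private states in $\mathbb{R}^2$ and a 1-D shared channel.

```python
import numpy as np

def sheaf_diffusion_step(x, edges, F, eta):
    """One synchronous round of sheaf diffusion.

    x     : array of shape (num_agents, d), one private state per agent
    edges : list of (i, j) pairs of communicating agents
    F     : dict mapping (i, j) to the restriction map F_ij, shape (k, d)
    eta   : step size
    """
    step = np.zeros_like(x)
    for i, j in edges:
        # Pairwise prediction error on the shared channel of edge (i, j).
        err = F[(i, j)] @ x[i] - F[(j, i)] @ x[j]
        step[i] += F[(i, j)].T @ err   # agent i moves to reduce the error
        step[j] -= F[(j, i)].T @ err   # agent j sees the same error with opposite sign
    return x - eta * step

def disagreement(x, edges, F):
    """Sum of squared projection mismatches over all edges."""
    return sum(float(np.sum((F[(i, j)] @ x[i] - F[(j, i)] @ x[j]) ** 2))
               for i, j in edges)

# Two agents: agent 0 publishes its first coordinate, agent 1 its second.
edges = [(0, 1)]
F = {(0, 1): np.array([[1.0, 0.0]]), (1, 0): np.array([[0.0, 1.0]])}
x = np.array([[1.0, 2.0], [3.0, -1.0]])

before = disagreement(x, edges, F)                   # (1 - (-1))^2 = 4
x_new = sheaf_diffusion_step(x, edges, F, eta=0.5)   # -> [[0, 2], [3, 0]]
after = disagreement(x_new, edges, F)                # 0
```

Only the published coordinates move; the coordinates outside the restriction maps (2 and 3 here) remain private and unchanged. With this particular choice of maps and `eta=0.5`, a single step reaches consensus exactly; in general several rounds and a sufficiently small step size are needed.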

Sheaf-ADMM

The Sheaf-ADMM architecture. A shared encoder maps each local view to a small convex problem; the ADMM layer alternates local optimization ($x$), consensus via sheaf diffusion ($z$), and dual accumulation ($u$) for $K$ rounds; a shared decoder turns the final states into local predictions, fused into the global answer. Every step is differentiable.
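To make the three alternating steps concrete, here is a toy sketch of the ADMM layer alone. It is an assumption-laden illustration, not the paper's implementation: the encoder and decoder are omitted, each agent's convex problem is assumed to be $\tfrac{1}{2}\|x_i - a_i\|^2$ with a hand-picked target $a_i$ standing in for the encoder output, and the names `diffuse` and `sheaf_admm_layer` and the parameters `rho`, `eta`, and `diffusion_steps` are hypothetical.

```python
import numpy as np

def diffuse(z, edges, F, eta, steps):
    """Run `steps` rounds of sheaf diffusion, pulling z toward consensus."""
    for _ in range(steps):
        step = np.zeros_like(z)
        for i, j in edges:
            err = F[(i, j)] @ z[i] - F[(j, i)] @ z[j]
            step[i] += F[(i, j)].T @ err
            step[j] -= F[(j, i)].T @ err
        z = z - eta * step
    return z

def sheaf_admm_layer(a, edges, F, rho=1.0, eta=0.5, K=50, diffusion_steps=1):
    """K rounds of scaled-form ADMM with toy objectives 0.5 * ||x_i - a_i||^2."""
    x = a.copy()
    z = a.copy()
    u = np.zeros_like(a)
    for _ in range(K):
        # x: local optimization (closed form for the assumed quadratic objective).
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z: consensus via sheaf diffusion, started from the agents' proposals.
        z = diffuse(x + u, edges, F, eta, diffusion_steps)
        # u: dual accumulation of the gap between local and consensus states.
        u = u + x - z
    return x, z

# Two agents in R^2 sharing a 1-D channel (hypothetical maps and targets).
edges = [(0, 1)]
F = {(0, 1): np.array([[1.0, 0.0]]), (1, 0): np.array([[0.0, 1.0]])}
a = np.array([[1.0, 2.0], [3.0, -1.0]])

x, z = sheaf_admm_layer(a, edges, F)
print(z)   # -> [[0, 2], [3, 0]]: shared coordinates meet at 0, private ones keep their targets
```

In this two-agent setup one diffusion step with `eta=0.5` happens to be an exact projection onto the consensus set, so the layer settles on the closest consensus state to the targets. On larger graphs a fixed number of diffusion steps only approximates that projection, which is one of the design choices a learned system would have to tune.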
