Code as Agent Harness
Preprint 2026 · Survey
Toward Executable, Verifiable, and Stateful Agent Systems
A code-centered view of agentic AI: code is not only generated output, but the operational substrate for reasoning, acting, environment modeling, execution feedback, and multi-agent coordination.
Xuying Ning¹†, Katherine Tieu¹†, Dongqi Fu²†, Tianxin Wei¹†, Zihao Li¹†, Yuanchen Bei¹†, Jiaru Zou³, Mengting Ai¹, Zhining Liu¹, Ting-Wei Li¹, Lingjie Chen¹, Yanjun Zhao¹, Ke Yang¹, Bingxuan Li¹, Cheng Qian¹, Gaotang Li¹, Xiao Lin¹, Zhichen Zeng¹, Ruizhong Qiu¹, Sirui Chen¹, Yifan Sun¹, Xiyuan Yang¹, Ruida Wang¹, Rui Pan¹, Chenyuan Yang¹, Dylan Zhang¹, Liri Fang¹, Zikun Cui², Yang Cao², Pan Chen², Dorothy Sun², Ren Chen², Mahesh Srinivasan², Nipun Mathur², Yinglong Xia², Hong Li², Hong Yan², Pan Lu³, Lingming Zhang¹, Tong Zhang¹, Hanghang Tong¹§, Jingrui He¹§
¹University of Illinois Urbana-Champaign · ²Meta · ³Stanford University · †Core Contributor · §Corresponding Author
Connected Layers
6+ Application Areas
102 PDF Pages
450+ Cited Work
Abstract
Code becomes the runtime medium for agents.
Recent LLMs have become strong code generators, but emerging agentic systems use code for more than final answers. This survey frames code as an agent harness: a unified infrastructure layer for agent reasoning, action, environment modeling, feedback-driven control, and verification. It studies how code connects agents to executable steps, durable state, reusable tools, tests, traces, repositories, and multi-agent workflows.
Taxonomy
Three Layers of Code as Harness
01 Harness Interface
Code connects agents to reasoning, action, and environment modeling: executable reasoning traces, programmable actions, DOM/API interfaces, simulators, tests, and state representations.
Reasoning substrate
Action interface
Environment representation
02 Harness Mechanisms
Planning, memory, tool use, control, and optimization sustain agents over long-horizon execution. Failures become feedback for repair rather than dead ends.
Planning and decomposition
Working and long-term memory
Tests, traces, and static analysis
03 Scaling the Harness
Shared code artifacts allow multiple agents to coordinate, review, test, debate, red-team, and verify progress inside a common repository or workflow state.
Manager, planner, coder, reviewer, tester roles
Centralized and distributed workflows
Shared state and collective verification
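As a rough illustration of shared state and collective verification, the sketch below models a workflow in which coder, tester, and reviewer roles act on one shared record of patches. Everything here is an assumption made for illustration: the `SharedWorkflow` class, the role names in `PERMISSIONS`, and the rule that a patch needs passing tests plus one independent approval are hypothetical, not a design described by the survey.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Patch:
    author: str
    diff: str
    tests_passed: bool = False
    approvals: List[str] = field(default_factory=list)

class SharedWorkflow:
    """Hypothetical shared state that several role-specific agents read and write."""

    # Assumed mapping from role to the single action that role may perform.
    PERMISSIONS = {"coder": {"submit"}, "tester": {"record_tests"}, "reviewer": {"approve"}}

    def __init__(self) -> None:
        self.patches: Dict[int, Patch] = {}
        self.log: List[str] = []   # durable record of who did what

    def _check(self, role: str, action: str) -> None:
        if action not in self.PERMISSIONS.get(role, set()):
            raise PermissionError(f"role {role!r} may not {action}")

    def submit(self, role: str, author: str, diff: str) -> int:
        self._check(role, "submit")
        patch_id = len(self.patches) + 1
        self.patches[patch_id] = Patch(author=author, diff=diff)
        self.log.append(f"{author} submitted patch {patch_id}")
        return patch_id

    def record_tests(self, role: str, patch_id: int, passed: bool) -> None:
        self._check(role, "record_tests")
        self.patches[patch_id].tests_passed = passed
        self.log.append(f"tests for patch {patch_id}: {'pass' if passed else 'fail'}")

    def approve(self, role: str, reviewer: str, patch_id: int) -> None:
        self._check(role, "approve")
        patch = self.patches[patch_id]
        if reviewer == patch.author:
            raise ValueError("authors may not approve their own patch")
        patch.approvals.append(reviewer)
        self.log.append(f"{reviewer} approved patch {patch_id}")

    def mergeable(self, patch_id: int) -> bool:
        """Collective verification: passing tests AND at least one independent approval."""
        patch = self.patches[patch_id]
        return patch.tests_passed and len(patch.approvals) >= 1

if __name__ == "__main__":
    wf = SharedWorkflow()
    pid = wf.submit("coder", "agent-a", "fix: handle empty input")
    wf.record_tests("tester", pid, True)
    wf.approve("reviewer", "agent-c", pid)
    print(wf.mergeable(pid), wf.log)
```

No single agent can mark its own work as done in this sketch: progress is only accepted once independent roles have written their results into the same state.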
Applications
Where the harness shows up
Coding Assistants
GUI / OS Agents
Embodied Agents
Scientific Discovery
Personalization
Recommendation
DevOps
Enterprise Workflows
Open Problems
Harness engineering is the hard part.
Evaluation beyond final success
Intermediate states, traces, repair attempts, and safety checks need first-class metrics.
Verification with incomplete feedback
Agents must act under partial tests, noisy execution signals, and hidden environment state.
Regression-free improvement
Harnesses should learn from failure without silently breaking previously working behavior.
Shared state across agents
Coordination depends on durable memory, repository state, review artifacts, and permissions.
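The regression-free improvement problem listed above can be made concrete with a simple acceptance gate: a candidate update is rejected if any test that passed before now fails. This is a minimal sketch under assumptions, not a procedure from the survey; `passing_tests`, `accept_update`, and the toy `suite` are hypothetical, and the gate only sees regressions that the available tests cover.

```python
from typing import Callable, Dict, List, Set, Tuple

Impl = Callable[[str], str]
Suite = Dict[str, Callable[[Impl], bool]]

def passing_tests(impl: Impl, suite: Suite) -> Set[str]:
    """Return the names of the checks that the implementation passes."""
    passed: Set[str] = set()
    for name, check in suite.items():
        try:
            if check(impl):
                passed.add(name)
        except Exception:
            pass   # a crashing check counts as a failure
    return passed

def accept_update(current: Impl, candidate: Impl, suite: Suite) -> Tuple[bool, List[str]]:
    """Accept the candidate only if nothing that passed before is now failing."""
    before = passing_tests(current, suite)
    after = passing_tests(candidate, suite)
    regressions = sorted(before - after)
    return (not regressions, regressions)

# Toy suite for a hypothetical string-normalizing function.
suite: Suite = {
    "strips": lambda f: f("  a ") == "a",
    "lowercases": lambda f: f("A") == "a",
}

def current(s: str) -> str:
    return s.strip()            # passes "strips", fails "lowercases"

def candidate_bad(s: str) -> str:
    return s.lower()            # fixes "lowercases" but silently breaks "strips"

def candidate_good(s: str) -> str:
    return s.strip().lower()    # fixes "lowercases" and keeps "strips"

if __name__ == "__main__":
    print(accept_update(current, candidate_bad, suite))
    print(accept_update(current, candidate_good, suite))
```

The first candidate fixes one behavior while breaking another, which is exactly the silent breakage the gate is meant to surface. The second candidate is accepted because the set of passing tests only grows.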
Citation
BibTeX
@misc{ning2026codeagentharness,
  title = {Code as Agent Harness: Toward Executable, Verifiable, and Stateful Agent Systems},
  author = {Xuying Ning and Katherine Tieu and Dongqi Fu and Tianxin Wei and Zihao Li and Yuanchen Bei and Jiaru Zou and Mengting Ai and Zhining Liu and Ting-Wei Li and Lingjie Chen and Yanjun Zhao and Ke Yang and Bingxuan Li and Cheng Qian and Gaotang Li and Xiao Lin and Zhichen Zeng and Ruizhong Qiu and Sirui Chen and Yifan Sun and Xiyuan Yang and Ruida Wang and Rui Pan and Chenyuan Yang and Dylan Zhang and Liri Fang and Zikun Cui and Yang Cao and Pan Chen and Dorothy Sun and Ren Chen and Mahesh Srinivasan and Nipun Mathur and Yinglong Xia and Hong Li and Hong Yan and Pan Lu and Lingming Zhang and Tong Zhang and Hanghang Tong and Jingrui He},
  year = {2026},
  eprint = {2605.18747},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL},
  url = {https://arxiv.org/abs/2605.18747},