Agentic-Agile: Why Agent Development Needs Agile (Not Just Prompts) - Microsoft for Developers
Daniel Epstein
"A bad system will beat a good person [or agent] every time" ~Dr. William Edwards Deming (with apologies)
I started vibe coding by writing prompts (often dictated into my phone), refining them with an agent in M365 Copilot, and creating handoff files to use with GitHub Copilot CLI. The results were predictably non-deterministic.
Prompt-driven development is a typical starting pattern: a developer opens a chat session, writes a prompt, reviews the output, adjusts, re-prompts. Maybe they get something useful. Maybe they spend an afternoon debugging emergent behavior that nobody specified and nobody tested. Then the process evolves to Spec-Driven Development: the developer creates specifications defining the "why" and the "what." They institute constraints and validation criteria, and the agent delivers more consistent code requiring less debugging.
But how do we scale this to teams of many humans and agents working in parallel? How do we persist development over large projects and codebases that exhaust most of the agent context window during initial grounding? How do we get better over time?
Several of my colleagues and I have started talking about a model we’re calling Agentic-Agile development as one methodology to address these problems.
Agentic-Agile Methodology: Wait! Hear me out!
I’m fortunate in my role at Microsoft as a Partner Tech Strategist (PTS) to work in a global team managing joint product and co-innovation with our leading Data & AI partners. We come with a combination of engineering and product management backgrounds and spend most of our time focused on getting teams working better together across organizations. Adding agents to the development team, while ensuring continuous improvement, is proving a natural extension of our role.
The original Agile Manifesto taught us to value individuals and interactions, working software, customer collaboration, and responding to change. It was so successful that for some it became dysfunctional dogma (see "The Death of Agile: Why Big Tech Is Ditching Scrum and What They Use Instead" by Ibrahim Irfan on Medium).
But Agile and Scrum were designed to sustain team velocity while keeping teams aligned on rapidly shifting business goals. Today’s problems are similar, but with agents the time scale and the makeup of the team have changed. Our processes need to be flexible enough to evolve and maintain that alignment with agents as part of the team. That doesn’t mean we need to abandon processes.
This article is the introduction to a longer series detailing practices that my colleagues and I use, which we hope will be helpful to others on the same path. I’ll often reference some of my personal projects, like Minthe, which started as an attempt to build a chief of staff agent in Microsoft Foundry. Minthe has become my larger experiment in Agentic-Agile development and has also helped to bootstrap several other projects.
This isn’t a finished process, and we would really like your feedback and participation to continue to improve the framework.
Read the full version of Toward an Agentic-Agile Manifesto
Explore the Agentic-Agile Template | GitHub
A roadmap view of Minthe (time not to scale)
The Problem: Development Without Process
Prompt-driven development works for small, self-contained tasks: Generate a function. Refactor a module. Write a test. These are bounded problems with clear outputs that modern AI coding agents handle well. Spec-Driven Development expands the scope of the tasks that can be delivered, but it doesn’t scale over time without careful grooming and maintenance.
The breakdown happens when scope grows. A multi-module system. An integration layer with external dependencies. A feature that spans files, schemas, and behavioral contracts. At that scale, prompt-driven development produces a set of familiar failures:
No backlog: There is no structured list of what needs to be built, in what order, with what dependencies. Work gets discovered during implementation, not planned before it.
No concept of done: Each prompt session ends when the developer feels satisfied, not when a contract is fulfilled. "Good enough" replaces "contract satisfied."
No phased delivery: Everything is attempted at once. There is no staged rollout, no incremental validation, no ability to pause and redirect.
No governance: Safety constraints, validation rules, and quality gates are bolted on after the fact, if they are added at all.
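To make those four gaps concrete, here is a minimal Python sketch of what their opposites could look like as data: a backlog with explicit dependencies, phases for staged delivery, and a "done" contract expressed as named checks. Everything in it (BacklogItem, plan, release_gate, the sample item names) is hypothetical and invented for illustration; it is not taken from the Agentic-Agile Template or from Minthe.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable


@dataclass
class BacklogItem:
    """One planned unit of work (hypothetical structure for illustration)."""
    key: str
    phase: int  # phased delivery: lower phases are scheduled first
    depends_on: list[str] = field(default_factory=list)
    # "Done" is a contract: every named check must pass.
    acceptance_checks: dict[str, Callable[[], bool]] = field(default_factory=dict)

    def is_done(self) -> bool:
        # An item with no checks has no contract, so it is never "done".
        if not self.acceptance_checks:
            return False
        return all(check() for check in self.acceptance_checks.values())


def plan(items: list[BacklogItem]) -> list[str]:
    """Return item keys in an order that respects dependencies.

    Among items that are ready at the same time, lower phases come first.
    """
    by_key = {item.key: item for item in items}
    for item in items:
        for dep in item.depends_on:
            if dep not in by_key:
                raise ValueError(f"{item.key} depends on unknown item {dep}")

    sorter = TopologicalSorter({item.key: item.depends_on for item in items})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    order: list[str] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda k: (by_key[k].phase, k))
        for key in ready:
            order.append(key)
            sorter.done(key)
    return order


def release_gate(items: list[BacklogItem]) -> list[str]:
    """Governance gate: list the keys whose contract is not yet satisfied."""
    return [item.key for item in items if not item.is_done()]


backlog = [
    BacklogItem("api", phase=2, depends_on=["schema"],
                acceptance_checks={"tests_pass": lambda: False}),
    BacklogItem("schema", phase=1,
                acceptance_checks={"migration_reviewed": lambda: True}),
    BacklogItem("ui", phase=2, depends_on=["api"]),
]

print(plan(backlog))          # ['schema', 'api', 'ui']
print(release_gate(backlog))  # ['api', 'ui']
```

The point of the sketch is not the particular data structure. Once order, dependencies, and completion criteria are written down, a human or an agent can check them mechanically, where otherwise they would be inferred from the last chat session.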
The result is predictable. Agents produce code that works in isolation but breaks under integration. Behavior drifts across sessions because there is no shared state defining expected behavior, or because durable memory systems contain stale and conflicting information. Defects escape into production because there was no structured review gate to...