Stop Asking "Saga or 2PC." Start Asking What Each Operation Needs

Jul, 2026 | Medium

Hiroyuki Yamada

4 min read

Part 3 of “Architecture in the AI Era.” (Part 2: why keeping data consistent across microservices stops being free, and why the usual Saga-vs-2PC advice gets both sides wrong.)

In the last piece I argued that the two standard ways to keep data consistent across services, Saga and two-phase commit, are usually framed as rivals, and that the framing gets both wrong. Saga is oversold by waving away its loss of isolation. Two-phase commit is dismissed on the strength of one aging implementation.

If both are misjudged, then “which one should we use” is the wrong argument to be having. The better question is narrower: what does this operation actually require?

Decide per operation, not per system

Most teams pick one approach and apply it everywhere. That’s the mistake. A real system doesn’t have one kind of operation; it has several, with different requirements.

Some operations need the strong guarantee. Moving money between accounts, committing inventory against an order, anything where another transaction must not see a half-finished state: these need atomicity and isolation across services. That’s the territory of two-phase coordination.

Others can’t have it, or shouldn’t pay for it. A workflow that runs for minutes, waits on a human approval, or calls an external system you don’t control can’t hold a transaction open across all of it. There, a saga (local steps with compensations) is the right tool, and the loss of isolation is a deliberate, acceptable trade.

The choice isn’t ideological. It’s per operation: match the mechanism to what the work needs.

The answer is usually “both”

Follow that logic and you stop choosing between Saga and 2PC at the system level, because a single system, sometimes a single workflow, needs both.

Picture an order. Reserving inventory and capturing payment is the part that has to be all-or-nothing and isolated: two-phase commit. Around it sits a longer flow (fraud checks, shipment scheduling, notifications, an external payment provider) that runs over time and reaches outside your walls: a saga. A strong-consistency core, wrapped in a longer, looser orchestration. They aren’t alternatives. They compose.

Which is why having only one of them in your toolkit is a real constraint. If all you have is Saga, you give up isolation even on the operations that genuinely need it, and you patch over the gap with application code. If all you have is 2PC, you can’t model the long-running, outward-facing flows that make up much of real business logic. The useful position is to have both available on common ground, and to apply each where it fits.

Which means putting 2PC back on the table

The reason teams don’t work this way today is the belief that two-phase commit is off-limits: that it’s locked to specific databases, fragile, and slow. As I argued last time, that reputation belongs to XA, a specific decades-old implementation, not to two-phase coordination as an idea.

Modern implementations of the same idea reconcile what XA couldn’t. They keep strong consistency across services without depending on database-specific transaction features, so they aren’t confined to a single vendor’s database. They manage the coordinator’s state through a consensus-based, replicated mechanism, so a single failed process is no longer a single point of failure. And distributed transaction technology has come a long way in thirty years: a wide range of optimizations, from parallelizing coordination to protocol-level improvements, are available now that simply weren’t part of the XA-era picture. The protocol was never the real bottleneck.

Once 2PC is a usable tool again rather than a cautionary tale, “pick the right one per operation, and compose them” becomes something you can actually do, instead of reaching for Saga everywhere because the alternative was ruled out years ago.

The question that’s left

Deciding per operation, and composing Saga and 2PC, settles how you keep data consistent. It leaves a different question untouched: where that machinery should live.

Today, most of it lives in application code: the retries, the compensations, the careful ordering, the reasoning about what’s safe to read and when. That’s a great deal of subtle, easy-to-get-wrong logic to carry in the application layer. And it’s precisely the layer that, as I argued in Part 1, AI is increasingly writing, without reasoning about any of these guarantees on your behalf.

So the real question isn’t only which mechanism per operation. It’s whether keeping data correct should be the application’s job at all.

That’s what I’ll take on to close the series.

If you run a...
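As a rough illustration of the order example from “The answer is usually “both””, the sketch below wraps an all-or-nothing core in a saga. It is a toy, not any particular product's API: `Participant`, `two_phase_commit`, `run_saga`, and `place_order` are hypothetical names, everything lives in memory, and isolation, durability, and coordinator recovery are deliberately not modeled. It only shows the shape of the composition: the core step either commits on both participants or on neither, and the surrounding flow undoes completed steps with compensations.

```python
from typing import Callable, List, Tuple


class Participant:
    """Hypothetical in-memory resource that can stage a change, then commit or abort."""

    def __init__(self, name: str, can_prepare: bool = True) -> None:
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote. A real participant would durably stage the change here.
        if self.can_prepare:
            self.state = "prepared"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Commit on every participant only if all of them vote yes; otherwise abort all."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


# A saga step: (name, action returning success, compensation).
Step = Tuple[str, Callable[[], bool], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse order."""
    done: List[Step] = []
    for step in steps:
        name, action, _ = step
        if action():
            log.append(f"done:{name}")
            done.append(step)
        else:
            log.append(f"failed:{name}")
            for prev_name, _, compensate in reversed(done):
                compensate()
                log.append(f"compensated:{prev_name}")
            return False
    return True


def place_order(inventory: Participant, payment: Participant,
                shipping_ok: bool, log: List[str]) -> bool:
    def undo_core() -> None:
        # A committed core is not rolled back; it is undone by a new compensating action.
        inventory.state = "released"
        payment.state = "refunded"

    steps: List[Step] = [
        ("fraud_check", lambda: True, lambda: None),
        ("reserve_and_capture",
         lambda: two_phase_commit([inventory, payment]), undo_core),
        ("schedule_shipment", lambda: shipping_ok, lambda: None),
    ]
    return run_saga(steps, log)


if __name__ == "__main__":
    events: List[str] = []
    inv, pay = Participant("inventory"), Participant("payment")
    print(place_order(inv, pay, shipping_ok=False, log=events))
    print(events, inv.state, pay.state)
```

The design point is that the saga never sees a half-finished core: `reserve_and_capture` is a single step from its point of view, and only a later failure (here, shipment scheduling) triggers the compensation path.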
