CRDTs Are Not Enough When Your Coworker Is an AI Agent | ProductNow
by Kadhir Mani (6.5 minutes)
Problem

CRDTs converge; they do not coordinate. That distinction becomes critical the moment an AI agent joins the editor.

We hit this while building a multiplayer editor where humans and agents could both modify the same document. A CRDT guarantees that replicas reach the same state, but it says nothing about whether the agent should write right now, while a human is actively working in the same section, or about where its output should land when its context is stale.

Human-only multiplayer assumes every keystroke reflects fresh, local intent. An agent breaks that assumption: it operates on a snapshot taken before inference begins, may rewrite large spans at once, and has no inherent presence awareness.

Why the Naive Solution Fails

The naive solution is to treat the agent as just another peer on the CRDT graph: a faster user. Let the CRDT merge concurrent edits and converge on a consistent state. Problem solved.

It was not that simple. CRDTs guarantee convergence, not that an edit should have been made in the first place. In practice, agents repeatedly overwrote sections that humans were actively editing. With several people in the same document, each running their own agent concurrently, the collisions compounded.

Agents exposed the gap in three specific ways:

- Stale context. The agent reads at T₀, infers, then writes at T₁. Human edits made between those moments are invisible to its prompt. The CRDT merges output that was reasoned against a state that no longer exists.
- Large-span rewrites. Humans edit words; agents rewrite paragraphs. A wider edit span raises collision probability and can silently invalidate comment anchors and structured blocks.
- No presence awareness. Human collaborators spot a cursor in a section and adjust their own editing accordingly. Agents have no equivalent signal. Without one, an agent writes into an actively edited section, and the CRDT faithfully merges the result.

Practical Solution Shape

We went down a different route.
We added a coordination layer that gates every write a non-human participant makes to the shared state and to persistence.

flowchart TD
    H[Human Editor] --> C["Coordination Layer (Presence, Locks, Approvals)"]
    A[AI Agent] --> C
    R[Reviewer] --> C
    C --> S[Collaborative Document State]
    S --> P[Durable Persistence]
The coordination layer sits above CRDT convergence and answers three questions before the agent writes:

- Is the target section free?
- Is a human actively focused there?
- Does the agent have a current snapshot of document state?

Once the agent has been cleared, it takes a narrow, expiring section-level lease (not a whole-document lock), acquired against a stable snapshot. In other words, while the agent holds the lease on a section, humans cannot edit that section, which avoids unpredictable collisions. A human can, however, override the lease and kick the agent out at any time.

This lets many humans and agents edit the shared canvas with generally fewer conflicts, and edits parallelize better.

Four invariants keep the system safe:

- No stale writes: version is checked at lease time; a mismatch forces a re-read.
- No unbounded locks: leases carry a TTL and auto-expire on crash or stall.
- Approval on conflict: if a human is actively in the target section, the agent's write is gated on an explicit approval signal before the lease is granted.
- No live-state/storage coupling: reconnect reconciles against server state, not a local event replay.

The trade-off is added latency and protocol surface area in exchange for safety and a more predictable user experience. TTLs must be calibrated to inference latency: too short and a lease expires while the agent is still working; too long and a stalled agent blocks the section.
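To make the lease invariants concrete, here is a minimal in-memory sketch of a section lease table. It is an illustration under stated assumptions, not the implementation described in this post: `LeaseTable`, `acquire`, `bump_version`, `human_override`, and the `"ok"` / `"stale"` / `"busy"` statuses are all hypothetical names, and a real system would keep this state on the server rather than in a Python dict.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Lease:
    section_id: str
    holder: str
    base_version: int
    expires_at: float


class LeaseTable:
    """Hypothetical in-memory store for section-level leases."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock                   # injectable so expiry can be tested
        self._leases: Dict[str, Lease] = {}
        self._versions: Dict[str, int] = {}   # current version per section

    def current_version(self, section_id: str) -> int:
        return self._versions.get(section_id, 0)

    def bump_version(self, section_id: str) -> None:
        """Record that a committed edit changed the section."""
        self._versions[section_id] = self.current_version(section_id) + 1

    def _live_lease(self, section_id: str) -> Optional[Lease]:
        lease = self._leases.get(section_id)
        if lease is not None and lease.expires_at <= self._clock():
            # No unbounded locks: an expired lease is dropped lazily.
            del self._leases[section_id]
            return None
        return lease

    def acquire(self, section_id: str, holder: str, base_version: int,
                ttl_seconds: float) -> Tuple[str, Optional[Lease]]:
        # No stale writes: the caller's snapshot must match the current version.
        if base_version != self.current_version(section_id):
            return "stale", None
        if self._live_lease(section_id) is not None:
            return "busy", None
        lease = Lease(section_id, holder, base_version,
                      self._clock() + ttl_seconds)
        self._leases[section_id] = lease
        return "ok", lease

    def release(self, lease: Lease) -> None:
        # Only the current holder's lease object can release the section.
        if self._leases.get(lease.section_id) is lease:
            del self._leases[lease.section_id]

    def human_override(self, section_id: str) -> None:
        """A human can evict an agent's lease at any time."""
        self._leases.pop(section_id, None)
```

A caller that gets `"stale"` would re-read the section and retry with the fresh version; a caller that gets `"busy"` would back off. Injecting the clock is a design choice made only so that TTL expiry can be exercised without sleeping.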
Large rewrites that span anchored comments need a coherence pass after publishing; lease scope alone does not protect anchor references.

flowchart LR
    N1(Request Edit) --> N2(Check Presence/Focus)
    N2 --> N3{Human Active?}
    N3 -->|Yes| N4(Request Approval)
    N4 -->|Approved| N5(Acquire Expiring Lease)
    N3 -->|No| N5
    N5 --> N6(Edit Stable Snapshot)
    N6 --> N7(Publish Collaborative Event)
    N7 --> N8(Release Lock)
    classDef p fill:#3E63DD,stroke:#263c85,color:#fff
    classDef d fill:#F59E0B,stroke:#b47408,color:#fff
    class N1,N2,N4,N5,N6,N7,N8 p
    class N3 d
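The post does not say how the coherence pass for anchored comments works, so the following is only one plausible shape for it. It assumes anchors are plain half-open character offsets into the section text, which is a simplification (CRDT editors typically use position identifiers that survive concurrent edits). `Anchor` and `reanchor` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Anchor:
    comment_id: str
    start: int  # half-open character range [start, end) in the section text
    end: int


def reanchor(anchors: List[Anchor], span_start: int, span_end: int,
             new_len: int) -> Tuple[List[Anchor], List[str]]:
    """Adjust anchors after text[span_start:span_end] was replaced by new_len chars.

    Returns the anchors that are still valid and the ids of comments whose
    anchored text was touched by the rewrite and therefore needs review.
    """
    delta = new_len - (span_end - span_start)
    kept: List[Anchor] = []
    orphaned: List[str] = []
    for a in anchors:
        if a.end <= span_start:
            kept.append(a)  # entirely before the rewrite: unchanged
        elif a.start >= span_end:
            # entirely after the rewrite: shift by the length change
            kept.append(Anchor(a.comment_id, a.start + delta, a.end + delta))
        else:
            orphaned.append(a.comment_id)  # overlaps rewritten text
    return kept, orphaned
```

Orphaned comments could then be surfaced to a human for re-attachment instead of being dropped silently.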
A Small Protocol

A minimal protocol for agent edits can be sketched without reference to any specific framework. The core steps are always the same:

    def agent_edit(section_id, base_version):
        state = read_presence(section_id)
        if state.human_active:
            approval = request_approval(section_id, ttl=45s)
            if not approval.granted:
                return conflict
        lease = acquire_lease(section_id, base_version, ttl=90s)
        if not lease.ok:
            return retry_with_fresh_snapshot
        snapshot = read_section(section_id)
        update = generate_edit(snapshot)
        ...