Agent Draw: An agent draws while you talk, built on TLDraw


We recently built 2draw, a Drawful-style game where players draw on a shared canvas and race to guess each other's drawings. It runs on tldraw, an infinite-canvas SDK for React.

We started wondering what it would take to put an agent in that loop, as an opponent or a rival guesser, and dug into how an agent could read and draw on a tldraw canvas. That research turned into:

Agent Draw, a tool that lets an agent draw on the canvas for you while you present.

Here you can see the agent assisting me in a demo presentation for a third-grade chemistry class.

You can try it right now, or grab the source:

Live demo: tldraw-agent-draw-demo.james-664.workers.dev

Source: github.com/ritza-co/tldraw-agent-draw-demo

Agent Draw is an agent that draws while you present

Drag a rectangle on the canvas, say what you want inside it, and by the time you look back it's there, drawn by an AI agent while you kept talking. Drag a few rectangles in a row and they queue up, each drawing in turn. All of this happens on an infinite canvas tool called tldraw.
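The "each drawing in turn" behaviour can be pictured as a promise chain. The sketch below is an illustration only, not the code from the repo: `SerialQueue` and `DrawJob` are hypothetical names, and a "job" is assumed to be any async function that draws one region.

```typescript
// Hypothetical sketch: run draw jobs strictly one after another.
type DrawJob = () => Promise<void>

class SerialQueue {
	// Tail of the chain. It never rejects, so one failed job cannot block later ones.
	private tail: Promise<void> = Promise.resolve()

	enqueue(job: DrawJob): Promise<void> {
		// Start this job only once everything queued before it has settled.
		const run = this.tail.then(job)
		// Swallow the failure on the chain itself; the caller still sees it via `run`.
		this.tail = run.catch(() => {})
		return run
	}
}
```

Each call to `enqueue` returns a promise for that job alone, so a caller can report a failed region without stopping the rectangles queued behind it.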

What is tldraw?

tldraw is an infinite-canvas SDK for React: the same editor API a user drives with a mouse, an agent can drive in code, creating shapes, moving them, drawing arrows between them.
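As a rough idea of what "driving the editor in code" looks like, here is a minimal sketch. tldraw's `Editor` has a `createShape` method that accepts a shape description; to keep the example runnable without tldraw installed, `EditorLike` below is a stand-in interface we made up, and `drawDecisionSkeleton` is a hypothetical helper. The `geo`, `w` and `h` props follow tldraw's geo shape as we understand it, so check them against the version you use.

```typescript
// Stand-in types (assumptions): a tiny subset of what tldraw's Editor accepts.
type GeoShapeInput = {
	type: 'geo'
	x: number
	y: number
	props: { geo: 'rectangle' | 'diamond'; w: number; h: number }
}

interface EditorLike {
	createShape(shape: GeoShapeInput): void
}

// Hypothetical helper: place a "start" box with a decision diamond below it.
function drawDecisionSkeleton(editor: EditorLike, originX: number, originY: number): void {
	editor.createShape({
		type: 'geo',
		x: originX,
		y: originY,
		props: { geo: 'rectangle', w: 160, h: 80 },
	})
	editor.createShape({
		type: 'geo',
		x: originX,
		y: originY + 140,
		props: { geo: 'diamond', w: 160, h: 120 },
	})
}
```

An agent does essentially this: it turns a request into a list of shape descriptions and hands them to the editor.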

We did not build our agent from scratch. tldraw already publishes an official Agent starter kit that draws and arranges shapes through a chat panel, backed by a Cloudflare Worker. We built Agent Draw on top of that.

How good is an agent at drawing?

Most models handle a simple composition built from tldraw's primitives (rectangles, diamonds, arrows). A more capable model tends to attempt a more ambitious composition and places shapes on the canvas more accurately. We tested this by giving each model the same two requests in one session: draw a decision diagram, and draw a person playing cricket.

claude-opus-4.8 handled both well: a clean decision diagram built from primitives and, more interestingly, a fully realised cricket scene sketched with the pen tool:

Where it struggles

That was claude-opus-4.8, one of the more capable models available. The result looks different with a smaller model behind the same requests.

A smaller model settles for less

Give the same two requests to a smaller model, and the ambition drops off fast. claude-haiku-4.5 matches Opus on the decision diagram, but for the cricket request it stays with primitives instead of reaching for the pen, and settles for a simpler, static composition with a label rather than a dynamic sketch:

A weaker model can give up on the task

Drop down further to google/gemini-2.5-flash-lite, and it seems to give up early on both requests:

How we made Agent Draw

The whole feature comes down to four pieces: a new canvas tool, a speech pipeline, a serialized draw queue, and a prompt section. Here is each piece, with the actual code from the repo.

Capturing the region you draw

tldraw tools are state machines. You subclass StateNode and define child states; tldraw routes pointer events to whichever state is active. Our AreaCaptureTool has three states (idle → pointing → dragging) and does its real work on pointer-up, when the dragged rectangle is final:
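To make the idle → pointing → dragging flow concrete, here is a stripped-down transition function in plain TypeScript. It does not use tldraw's `StateNode`; the event names and `nextCaptureState` are hypothetical, and real tools typically wait for a small drag threshold before entering the dragging state, which this sketch skips.

```typescript
type CaptureState = 'idle' | 'pointing' | 'dragging'
type PointerEventName = 'pointer_down' | 'pointer_move' | 'pointer_up'

// Hypothetical transition table for a drag-to-capture tool.
function nextCaptureState(state: CaptureState, event: PointerEventName): CaptureState {
	switch (state) {
		case 'idle':
			// Pressing the pointer arms the tool.
			return event === 'pointer_down' ? 'pointing' : 'idle'
		case 'pointing':
			// Moving while pressed starts a drag; releasing without moving cancels.
			if (event === 'pointer_move') return 'dragging'
			if (event === 'pointer_up') return 'idle'
			return 'pointing'
		case 'dragging':
			// Releasing ends the drag; this is where the capture would be handed off.
			return event === 'pointer_up' ? 'idle' : 'dragging'
	}
}
```

Keeping the hand-off on the dragging → idle transition means the rectangle is only used once it is final.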

    class AreaCaptureDragging extends StateNode {
        static override id = 'dragging'

        private bounds: BoxModel | undefined = undefined

        override onPointerUp() {
            this.editor.updateInstanceState({ brush: null })
            if (!this.bounds) throw new Error('Bounds not set')
            // Hand the captured rectangle (in page coordinates) to the capture session.
            startCaptureSession(this.bounds)
            this.parent.transition('idle')
        }

        updateBounds() {
            if (!this.initialPagePoint) return
            const currentPagePoint = this.editor.inputs.getCurrentPagePoint()
            const x = Math.min(this.initialPagePoint.x, currentPagePoint.x)
            const y = Math.min(this.initialPagePoint.y, currentPagePoint.y)
            const w = Math.abs(currentPagePoint.x - this.initialPagePoint.x)
            const h = Math.abs(currentPagePoint.y - this.initialPagePoint.y)
            // Show tldraw's native selection brush while dragging.
            this.editor.updateInstanceState({ brush: { x, y, w, h } })
            this.bounds = { x, y, w, h }
        }
    }

We get the live selection-brush rectangle for free by writing to editor.updateInstanceState({ brush }), the same instance state tldraw's own select tool uses. The bounds are in page coordinates, so they stay correct no matter how the user has panned or zoomed.
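The `Math.min` / `Math.abs` arithmetic is what lets the user drag in any direction and still get a box with a top-left origin and positive size. Pulled out as a standalone function it looks like the sketch below. `boxFromDrag` and `screenToPage` are hypothetical names, and the camera formula is a simplified model we are assuming (viewport origin at 0,0), not tldraw's implementation.

```typescript
type Point = { x: number; y: number }
type Box = { x: number; y: number; w: number; h: number }

// Normalize two corner points into a box, whichever way the drag went.
function boxFromDrag(start: Point, current: Point): Box {
	return {
		x: Math.min(start.x, current.x),
		y: Math.min(start.y, current.y),
		w: Math.abs(current.x - start.x),
		h: Math.abs(current.y - start.y),
	}
}

// Assumed camera model: the camera stores a pan offset (x, y) and a zoom factor z.
function screenToPage(screen: Point, camera: { x: number; y: number; z: number }): Point {
	return { x: screen.x / camera.z - camera.x, y: screen.y / camera.z - camera.y }
}
```

Because both corners are converted to page space before the box is computed, panning or zooming afterwards changes where the box appears on screen but not the coordinates stored for it.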

Listening while you talk

The moment a capture starts, we open the mic. AreaRecorder is a thin wrapper over the browser's MediaRecorder, deliberately with no knowledge of the agent or transcription, just start() and stop():

    export class AreaRecorder {
        async start(): Promise<void> {
            this.stream = await navigator.mediaDevices.getUserMedia({ audio: true })
            const recorder = new MediaRecorder(this.stream, { mimeType: this.mimeType })
            this.chunks = []
            recorder.ondataavailable = (event) => {
                if (event.data.size > 0)...
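The recorder passes a `mimeType` to `MediaRecorder`, and browsers differ in which audio containers they support. One way to choose a value is sketched below. This is our own illustration rather than the repo's code: `pickMimeType` is a hypothetical helper, the candidate list is an assumption, and the support check is passed in as a function so the sketch also runs outside a browser.

```typescript
// Hypothetical helper: return the first candidate the recorder reports as supported.
function pickMimeType(
	candidates: string[],
	isSupported: (type: string) => boolean
): string | undefined {
	return candidates.find((type) => isSupported(type))
}

// Assumed candidate order; adjust to whatever your transcription service accepts.
const AUDIO_CANDIDATES = ['audio/webm;codecs=opus', 'audio/webm', 'audio/mp4']

// In a browser you would pass the real check:
//   pickMimeType(AUDIO_CANDIDATES, (type) => MediaRecorder.isTypeSupported(type))
```

If nothing matches, the function returns `undefined`, and omitting the `mimeType` option lets the browser fall back to its default.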
