AI can now control your desktop

By AmDab

Clawd Cursor v1.0.0 — the local MCP server for safe desktop control

v1.0.0 — latest stable

A cursor and a keyboard for any AI agent

Any model. Any app. One MCP entry. Local-only. 6 compact tools, single safety chokepoint, no telemetry. Cheap paths first — accessibility before pixels, vision only as a last resort.


Toolbox: 6 compound tools (recommended)

Tools: 94 granular (compat / debug)

Operating systems: 3

Two ways to use it

Run it yourself, or hand it to your agent.

Test from the CLI

Plain English in, actions out.

clawdcursor doctor
clawdcursor agent

Wire it into your agent

One MCP entry, and desktop control appears as native tools.

Claude Code, Cursor, Windsurf, OpenClaw, Zed

Pick a mode

How will your AI talk to it?

Same tools, three entry shapes. Pick once during install.

clawdcursor mcp — recommended

AI lives in your editor (Claude Code, Cursor, Windsurf, Zed). The editor spawns clawdcursor on demand over stdio. No daemon, no port.

{
  "mcpServers": {
    "clawdcursor": {
      "command": "clawdcursor",
      "args": ["mcp", "--compact"]
    }
  }
}

Tools: 6 compact / 94 granular

Transport: stdio
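Editors usually keep several MCP servers in one config file, so the entry has to be merged rather than pasted over what is already there. A minimal sketch of that merge, assuming the config is already loaded as a dictionary (the helper name `add_clawdcursor` is hypothetical, and the location of each editor's config file is not covered here):

```python
import copy
import json

# The entry itself comes from the snippet above.
CLAWDCURSOR_ENTRY = {"command": "clawdcursor", "args": ["mcp", "--compact"]}


def add_clawdcursor(config: dict) -> dict:
    """Return a copy of an MCP config with the clawdcursor entry added."""
    merged = copy.deepcopy(config)
    # Keep any servers that are already configured.
    merged.setdefault("mcpServers", {})["clawdcursor"] = copy.deepcopy(CLAWDCURSOR_ENTRY)
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
print(json.dumps(add_clawdcursor(existing), indent=2))
```

The input dictionary is left untouched, so the result can be inspected before it is written back to disk.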

clawdcursor agent — autonomous daemon

clawdcursor brings its own LLM brain (configured via doctor). For unattended runs, scheduled tasks, multi-process orchestration.

Run clawdcursor doctor · pick a provider

Run clawdcursor agent

POST tasks to 127.0.0.1:3847/mcp

HTTP MCP: :3847

Providers: 13+
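Posting a task to the daemon could look roughly like the sketch below. The endpoint and the `task` tool with its single `instruction` argument come from this page. The JSON-RPC 2.0 envelope, the `tools/call` method, and the `Accept` header follow the MCP specification. Whether the daemon expects exactly this shape is an assumption, and the helper names are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the text above.
MCP_URL = "http://127.0.0.1:3847/mcp"


def build_task_request(instruction: str, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a tools/call request for the `task` tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "task", "arguments": {"instruction": instruction}},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request to a running daemon and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With `clawdcursor agent` running, `send(build_task_request("Open the calculator"))` would submit the instruction. Building the request separately makes the payload easy to inspect without a live daemon.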

clawdcursor agent --no-llm — BYO brain

Your agent already has a brain — you just want HTTP tools. Same daemon, no built-in agent loop.

Run clawdcursor agent --no-llm

94 tools on :3847/mcp

Stateless — no session init needed

Granular tools (compat): 94

HTTP client: any
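Because the endpoint is described as stateless, an external agent should be able to send tool requests directly, with no `initialize` handshake first. A small sketch of the two messages such a client might serialize, assuming the standard MCP methods `tools/list` and `tools/call` (the `jsonrpc` helper is hypothetical; `key_press` and its `key` argument appear elsewhere on this page):

```python
import json
from typing import Any, Dict, Optional


def jsonrpc(method: str, params: Optional[Dict[str, Any]] = None, request_id: int = 1) -> str:
    """Serialize one JSON-RPC 2.0 request, the envelope MCP uses on the wire."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# No session setup: each message stands alone.
list_tools = jsonrpc("tools/list")
press_save = jsonrpc(
    "tools/call",
    {"name": "key_press", "arguments": {"key": "mod+s"}},
    request_id=2,
)
print(list_tools)
print(press_save)
```

Each string can be sent as the body of a POST to :3847/mcp by any HTTP client.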

How it works

Cheap paths first.

A11y tree before pixels. Vision only when needed.

1 Accessibility

Zero pixels. Read the a11y tree, act on element names. No screenshot, no vision LLM.

2 Escalate as needed

Cheapest rung that works. OCR when the tree is sparse, screenshot when you need pixels, vision only for canvas UIs.
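The escalation idea can be sketched as a ladder of probes tried in order of cost. This is an illustration only, not clawdcursor's implementation: the rung names mirror the text, while `locate` and the lambda probes are hypothetical stand-ins.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each rung is (name, probe). A probe returns the located target or None.
Rung = Tuple[str, Callable[[str], Optional[str]]]


def locate(target: str, rungs: Sequence[Rung]) -> Tuple[str, str]:
    """Try rungs from cheapest to most expensive; return (rung_name, result)."""
    for name, probe in rungs:
        result = probe(target)
        if result is not None:
            return name, result
    raise LookupError(f"no rung could locate {target!r}")


# Hypothetical probes: the a11y tree is sparse here, so OCR is the first hit.
ladder = [
    ("accessibility", lambda t: None),
    ("ocr", lambda t: f"text match for {t}"),
    ("screenshot", lambda t: f"pixels for {t}"),
    ("vision", lambda t: f"vision model answer for {t}"),
]

print(locate("Save", ladder))  # ('ocr', 'text match for Save')
```

The more expensive rungs are never called once a cheaper one succeeds, which is where the cost saving comes from.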

3 Safety

Single chokepoint. Every tool call gates through one safety layer. Destructive actions need confirmation.
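A single chokepoint means there is exactly one function through which every action must pass before its handler runs. A minimal sketch of that pattern, assuming a hypothetical set of destructive actions and a caller-supplied `confirm` callback (the page does not list which actions count as destructive):

```python
from typing import Any, Callable, Dict

# Hypothetical classification; the real list is not given in the text.
DESTRUCTIVE = {"close", "delete_file"}

Handler = Callable[[Dict[str, Any]], str]
HANDLERS: Dict[str, Handler] = {
    "click": lambda args: f"clicked {args['name']}",
    "close": lambda args: f"closed {args['window']}",
}


def call_tool(action: str, args: Dict[str, Any], confirm: Callable[[str], bool]) -> str:
    """Single entry point: every action passes this check before its handler runs."""
    if action not in HANDLERS:
        raise KeyError(f"unknown action: {action}")
    if action in DESTRUCTIVE and not confirm(action):
        raise PermissionError(f"{action} needs confirmation")
    return HANDLERS[action](args)


print(call_tool("click", {"name": "OK"}, confirm=lambda a: False))  # clicked OK
```

Since handlers are only reachable through `call_tool`, a new tool cannot bypass the check by accident.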

🎯

Toolbox — 6 compound tools

The recommended surface — computer, accessibility, window, system, browser, task. ~12× smaller catalog than the granular Tools surface.

🧩

One adapter per OS

Windows, macOS, Linux behind a single interface. Linux covers X11 and Wayland.

Features

Any OS. Any model.

🍎

macOS

TCC-safe. clawdcursor grant handles Accessibility + Screen Recording.

🪟

Windows

Native UIA + Windows.Media.Ocr. x64 and ARM64.

🐧

Linux

X11 and Wayland. AT-SPI for a11y, Tesseract for OCR.

🖱️

Smart tools

Click by name, type by label, read screen. A11y first, OCR as fallback.

⌨️

Shortcuts engine

Platform-aware key combos — Cmd on macOS, Ctrl elsewhere. No LLM cost.
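Platform-aware combos can be resolved with a plain lookup, with no model call involved. A sketch under the assumption that a portable `mod` token (as in `mod+s`) stands for the platform's primary modifier. The function name and the output spellings `cmd` and `ctrl` are illustrative.

```python
import sys


def resolve_combo(combo: str, platform: str = sys.platform) -> str:
    """Expand the portable `mod` key to cmd on macOS and ctrl elsewhere."""
    # sys.platform is "darwin" on macOS.
    modifier = "cmd" if platform == "darwin" else "ctrl"
    keys = combo.lower().split("+")
    return "+".join(modifier if key == "mod" else key for key in keys)


print(resolve_combo("mod+s", platform="darwin"))  # cmd+s
print(resolve_combo("mod+s", platform="win32"))   # ctrl+s
```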

📦

batch — one round-trip

Collapse N deterministic tool calls into a single guarded, safety-gated batch. N calls → 1.
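One way to picture a gated batch: the caller sends a list of steps once, every step is checked, and only then does anything run. The step shape, the `run_batch` name, and the choice to reject the whole batch up front are all assumptions made for illustration, since the page does not describe the real batch format or its failure behaviour.

```python
from typing import Any, Callable, Dict, List

Step = Dict[str, Any]


def run_batch(
    steps: List[Step],
    gate: Callable[[Step], bool],
    execute: Callable[[Step], Any],
) -> List[Any]:
    """Check every step against the gate first, then run them all in order."""
    blocked = [step for step in steps if not gate(step)]
    if blocked:
        # Reject the whole batch before anything runs, so there is no partial run.
        raise PermissionError(f"blocked steps: {blocked}")
    return [execute(step) for step in steps]


# Three calls collapsed into one round-trip (hypothetical step shape).
steps = [
    {"tool": "window", "action": "focus", "title": "Notes"},
    {"tool": "computer", "action": "type", "text": "hello"},
    {"tool": "computer", "action": "key", "combo": "mod+s"},
]
results = run_batch(
    steps,
    gate=lambda step: step["action"] != "close",
    execute=lambda step: f"{step['tool']}.{step['action']} ok",
)
print(results)  # ['window.focus ok', 'computer.type ok', 'computer.key ok']
```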

Tools

6 compact tools + 94 granular

The 6 compact compounds are the recommended public surface. Each row lists the actions you pass via { "action": "…" }. The 94 granular tools (one schema per verb) are listed below for compatibility and debugging; use them when your runtime requires every primitive as a top-level MCP tool.

| Compound | Purpose | Actions |
| --- | --- | --- |
| computer | Mouse, keyboard, screenshots. The raw I/O surface. | screenshot · click · double_click · right_click · triple_click · hover · scroll · scroll_horizontal · drag · drag_path · type · key · wait |
| accessibility | Drive UI by element name, not by pixel. Survives DPI, resize, layout shifts. | read_tree · find · get_element · focused · invoke · focus · set_value · get_value · expand · collapse · toggle · select · state · list_children · wait_for |
| window | Launch, focus, resize. App-level state management. | list · active · focus · maximize · minimize · restore · close · resize · list_displays · screen_size · open_app · open_file · open_url · switch_tab · navigate |
| system | Clipboard, OCR, shortcuts, undo, webview detection + CDP relaunch, the active system prompt, and task delegation. The meta surface for an external brain. | clipboard_read · clipboard_write · system_time · ocr · undo · shortcuts_list · shortcuts_run · delegate · detect_webview · relaunch_with_cdp · system_prompt |
| browser | Chrome DevTools Protocol: real DOM access for Electron / WebView2 apps whose a11y tree is sparse. | connect · page_context · read_text · click · type · select_option · evaluate · wait_for · list_tabs · switch_tab · scroll |
| task | Hand off the whole task to clawdcursor's autonomous loop. Daemon mode only: requires clawdcursor agent with an LLM configured. | single arg: { instruction: string }; no action enum |

Compact form (recommended): computer({ "action": "key", "combo": "mod+s" }), ~1,500 tokens of catalog.
Granular form (compat / debug): key_press({ "key": "mod+s" }), 94 individual tools, one schema per verb.
Both produce identical effects through the same...
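The relationship between the two forms can be illustrated with two thin wrappers around one shared handler. This is a sketch, not the server's code: the tool names and argument keys are taken from the examples above, while `press_key` and its return value are hypothetical.

```python
from typing import Any, Dict


def press_key(combo: str) -> str:
    """Shared handler; a real server would synthesize the key event here."""
    return f"pressed {combo}"


def computer(args: Dict[str, Any]) -> str:
    """Compact form: one tool, with the verb chosen by the `action` field."""
    if args["action"] == "key":
        return press_key(args["combo"])
    raise ValueError(f"unsupported action: {args['action']}")


def key_press(args: Dict[str, Any]) -> str:
    """Granular form: one tool per verb, so no `action` field is needed."""
    return press_key(args["key"])


# Both forms reach the same handler with the same combo.
assert computer({"action": "key", "combo": "mod+s"}) == key_press({"key": "mod+s"})
```

The compact form trades 94 top-level schemas for a handful of tools with an `action` enum, which is what keeps the catalog small.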
