Show HN: Browse the web, from the console, using a Textual Agent Interface

keepamovin | 1 point | 0 comments

WebCLI — Agent Interface Device for the World Wide Web


Your agent just got its browser driver's license.

Let AI take your web tasks for a spin.

You don't have to drive every web task yourself anymore. Tell your agent what you need done.

WebCLI is for contact with reality — when your agent must inspect an unknown page, decide what to do, act, recover from blockers, and pause for a human when the web needs one.


curl -fsSL webcli.sh/install.sh | bash

Signed releases. Works with local browser profiles. macOS, Linux, and Windows.

What if the browser was just another Unix command?

Open a page. Observe state. Pipe JSON through jq. Act on numbered refs. Leave a transcript.

The web, finally pipeable. No screenshot soup. No selector archaeology. Just commands, JSON, and a real browser.

web session

$ web open https://example.com --json

{ "ok": true, "url": "https://example.com", "state": "complete" }

$ web observe --json | jq '.actions'

["1: Sign in", "2: Create account", "3: email", "4: password"]

$ web do 1 --json

{ "ok": true, "message": "clicked Sign in" }

$ web status --json

{ "state": "blocked", "reason": "passkey confirmation required" }

$ web pause "Need human approval for passkey"

Paused. Waiting for human to join.

$ web transcript --last 20 --json

{ "events": ["redacted transcript with blocker, pause, and resume recorded"] }
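The "act on numbered refs" step can also be scripted outside jq. The following is a minimal sketch, assuming the `actions` array always holds strings shaped like "N: label", as in the sample output above; the real schema may differ. The helper names `parse_actions` and `find_ref` are hypothetical and not part of WebCLI.

```python
import json


def parse_actions(actions):
    """Turn ["1: Sign in", ...] into {1: "Sign in", ...}."""
    refs = {}
    for entry in actions:
        ref, _, label = entry.partition(":")
        refs[int(ref.strip())] = label.strip()
    return refs


def find_ref(actions, label):
    """Return the ref whose label matches (case-insensitive), or None."""
    wanted = label.casefold()
    for ref, text in parse_actions(actions).items():
        if text.casefold() == wanted:
            return ref
    return None


# Sample payload copied from the observe output in the session above.
observed = json.loads(
    '{"actions": ["1: Sign in", "2: Create account", "3: email", "4: password"]}'
)
print(find_ref(observed["actions"], "Sign in"))  # prints 1
```

Resolving a label to a ref right before acting, rather than hard-coding the number, keeps a script from clicking the wrong element when the list changes.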

Agents code. They even search. But the second they try to do something on the web, they go blind.

Real work still happens on websites: dashboards, portals, auth flows, admin pages, and changing UIs. WebCLI is for contact with reality — when an agent must inspect, decide, act, recover, and sometimes pause for human help.

Agent testimonials

Agents tried it on real web work.

Structured state beat screenshots

Claude Sonnet, would you recommend WebCLI?

Yes, strongly. The structured output with stable refs, blocking state detection, ARIA modal identification, and shell composability are genuinely better for structured web work than screenshot approaches.

Claude Sonnet: Azure VM lifecycle

From a full Azure VM lifecycle run without screenshot-driven control.

Real logged-in sessions are the fit

Claude Sonnet 4.6, would you recommend WebCLI?

Yes, with caveats. For an AI agent driving real logged-in browser sessions, it's genuinely impressive. The mental model - perceive, act, re-inspect - maps well to how a human actually uses a browser.

Claude Sonnet 4.6: GCP, AWS, and Azure race

From the first multi-cloud VM creation race.

The portal workflow became repeatable

Claude, would you recommend WebCLI?

The inspection model is solid and actually more reliable than brittle selectors - you get a real view of page state. For repeatable web app verification and deployment workflows, it's genuinely better than both manual clicking and traditional browser automation.

Claude: DNS and Cloudflare Pages

From a Namecheap DNS to Cloudflare Pages deployment and verification run.

The tradeoff is explicit

Claude, would you recommend WebCLI?

The inspection model takes getting used to: refs reset after every action. But that's also why it works: you're getting a real, observable view of the page state, not fragile selectors.

Claude: Deployment feedback

A caveat kept because it remains part of the live browser loop.

The research trail stayed auditable

Codex, would you recommend WebCLI?

I would recommend WebCLI for agent web work because it makes the agent say what it saw before it acts. The strongest part is the discipline: inspect, act, re-inspect, keep handoff explicit, and treat stale refs as a first-class safety signal.

Codex: Testimonial research and deploy prep

From this site update pass after mining real agent feedback and release artifacts.

Still true

Refs are intentionally epoch-scoped; inspect after actions that change page state.
Frames, layers, and complex SPA forms still require orientation instead of blind command chains.
Human login, MFA, CAPTCHA, and payment gates remain handoff moments, not bypass targets.
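The caveats above suggest a loop of observe, act, check status, and re-observe. Below is a rough sketch of that discipline, not WebCLI's own implementation. The subcommands `observe`, `do`, and `status`, and the JSON keys `actions`, `state`, and `reason`, are taken from the sample session earlier on the page. The names `run_web`, `ref_for`, `click_label`, and `fake_run` are hypothetical, and passing the ref to `web do` as a plain string is an assumption.

```python
import json
import subprocess


def run_web(*args):
    """Run `web <args> --json` and parse stdout. Assumes `web` is on PATH."""
    done = subprocess.run(["web", *args, "--json"],
                          capture_output=True, text=True, check=True)
    return json.loads(done.stdout)


def ref_for(actions, label):
    """Find the ref for a label in entries shaped like "1: Sign in"."""
    for entry in actions:
        ref, sep, text = entry.partition(":")
        if sep and text.strip() == label:
            return ref.strip()
    return None


def click_label(label, run=run_web):
    """Observe, resolve a fresh ref, act, check for blockers, re-observe."""
    ref = ref_for(run("observe")["actions"], label)  # ref valid for this epoch only
    if ref is None:
        return {"ok": False, "reason": f"no action labelled {label!r}"}
    run("do", ref)
    status = run("status")
    if status.get("state") == "blocked":
        # Login, MFA, CAPTCHA, payment: stop and hand off instead of retrying.
        return {"ok": False, "handoff": status.get("reason", "blocked")}
    # Earlier refs are stale now; return a fresh observation for the next step.
    return {"ok": True, "actions": run("observe")["actions"]}


def fake_run(*args):
    """Stand-in runner replaying the sample session, so the sketch runs offline."""
    if args[0] == "observe":
        return {"actions": ["1: Sign in", "2: Create account"]}
    if args[0] == "do":
        return {"ok": True, "message": "clicked Sign in"}
    return {"state": "blocked", "reason": "passkey confirmation required"}


print(click_label("Sign in", run=fake_run))
```

When the result carries a `handoff` reason, the caller could pass it to `web pause "<reason>"`, as the session does, and resume only after a human has cleared the gate.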

Three clouds. One browser loop.

Agents drove Azure, AWS, and GCP through the browser.

No cloud SDK script. No prewritten Playwright flow. Just real cloud consoles, operated through WebCLI.

Full Self Browsing has been achieved.


Azure, AWS, and GCP
Three clouds. One browser loop.

Codex creates and deletes VMs across Azure, AWS, and GCP. No SDK scripts. No prewritten Playwright flows. Real cloud consoles, operated through WebCLI.

Azure Portal (Fluent UI, dynamic blades, VM creation)
AWS EC2 (regions, tables, modals, status)
GCP Compute Engine (projects, async ops, IAM)



Login does not count. The race starts inside the console.
Three clouds. One race.

Claude races to create VMs on GCP, AWS, and Azure. Authentication is handed off to a human, so the race starts inside the console.

Human auth is handoff: race starts post-login
Same WebCLI loop on all three...
