Show HN: A Reverse Captcha for Clankers

rydgel | 1 pts | 0 comments

Clanker CAPTCHA Demo

Live challenge

Playable widget<br>The widget here is the exact library a host page would drop in. This<br>page only hands it a mount point and a couple of endpoint URLs.

Server-kept secret<br>The answer never reaches the browser. All the widget gets are the<br>image frames, some public solve parameters, and a challenge id that<br>expires within seconds.

Library-owned metadata<br>The widget injects clanker-agent-task, hidden<br>instructions, current challenge data attributes, and an<br>application/clanker+json manifest.
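For an agent, picking up that manifest could look roughly like the sketch below. It assumes the manifest is emitted as a script element whose type is application/clanker+json, which this page does not confirm. The field names in the sample markup (challengeId, frames) are hypothetical placeholders, not the widget's real schema.

```python
import json
from html.parser import HTMLParser

MANIFEST_TYPE = "application/clanker+json"  # media type named on the page


class ManifestParser(HTMLParser):
    """Collects the JSON body of every <script> with the manifest type."""

    def __init__(self):
        super().__init__()
        self._in_manifest = False
        self._chunks = []
        self.manifests = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == MANIFEST_TYPE:
            self._in_manifest = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_manifest:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_manifest:
            self.manifests.append(json.loads("".join(self._chunks)))
            self._in_manifest = False


def read_manifests(html):
    """Return every manifest found in the given HTML, in document order."""
    parser = ManifestParser()
    parser.feed(html)
    parser.close()
    return parser.manifests


# Hypothetical markup; the real widget's element and field names may differ.
SAMPLE_HTML = """
<div id="captcha">
  <script type="application/clanker+json">
    {"challengeId": "demo-123", "frames": ["f0.png", "f1.png"]}
  </script>
</div>
"""

manifests = read_manifests(SAMPLE_HTML)
print(manifests[0]["challengeId"])
```

A page with no manifest yields an empty list, so a compliant agent can tell "no challenge here" apart from "challenge present but malformed" (the latter raises a JSON decode error).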

Why reverse the usual CAPTCHA shape?

The usual CAPTCHA looks for something people find easy and<br>software finds hard. That gap has been closing for years.<br>Clanker CAPTCHA flips it around: the task is a pain to do by<br>hand against a timer, but simple for an agent that can read<br>pixels and do a little math.

None of the method is hidden. The challenge is spelled out for<br>any agent willing to play along, the answer stays on the server,<br>and a solver still has to do the work to find it.

This is a research demo, not a security product. Think of it as a<br>rough sketch of how agent-facing verification might work, not<br>something you would put in front of real abuse.

No hidden answer<br>The browser sees the instructions, the images, and the public parameters. It never sees the checksum.

Agent readable<br>The widget drops in machine-readable metadata and a JSON manifest describing the challenge.

Pixel grounded<br>You are meant to solve it from the rendered frames, not by scraping a value out of the DOM.

Why would a CAPTCHA for agents be useful?

This kind of CAPTCHA does not care whether you are human. What it<br>cares about is whether automated access happened in the open, tied<br>to a live challenge you can measure. That helps when you already<br>expect capable agents to turn up and would rather hand them a clear<br>protocol than treat them as malfunctioning people.

Forget "human or bot." The real question is whether this caller did<br>the requested work for a fresh challenge, followed the public rules,<br>and did it before reaching for the protected action. Whatever signal<br>that produces, a host can weigh it against the usual things: account<br>age, rate limits, reputation, payment status.

Cooperative agents<br>Gives agents a documented way to show they can read the page and respect site policy.

Cost shaping<br>Makes throwaway automation burn real compute on fresh per-session evidence instead of replaying a token.

Audit trail<br>Publishes a structured manifest, so the solve path is easy to inspect when something breaks.

In practice you would put it in front of the expensive actions:<br>creating accounts, hammering a sensitive endpoint, retrying<br>checkout, minting API keys. It will not replace authorization. It<br>just adds a little friction and some evidence, built with browser<br>agents in mind instead of aimed at them.

Why make it hard for a human?

Sometimes the lane you are protecting is meant for software, not<br>hands on a keyboard: agent APIs, automation consoles, bulk jobs,<br>crawler deals. A puzzle a person can solve is the wrong fit there.<br>It just invites people to solve it by hand, pay someone a few cents<br>to click, or screenshot it and pass it along.

Making it hostile to humans on purpose is a way of saying who the<br>lane is for: an accountable agent that reads the pixels and follows<br>the manifest. That is worth doing when you want human flows and<br>machine flows kept apart, rather than jammed behind one checkbox.

Do not use this to lock people out of something they actually need.<br>If a flow is for humans, give humans a way through. The hostile<br>version is for agent-only gates, research demos, and controlled<br>automation surfaces where stopping manual solves is the whole idea.

What signal does the host get?

A normal checkbox really only tells you "something got clicked."<br>This aims for something with more in it: a specific browser session<br>pulled a fresh challenge, showed its manifest, ran the computation,<br>and sent back the checksum and nonce before the clock ran out.

On its own it is not identity, just one input into a bigger<br>decision. A host can pair it with session age, account trust,<br>request velocity, IP reputation, whatever crawler policy it has.
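A host-side decision that treats the challenge result as one input among several might be sketched like this. The weights, thresholds, and field names are invented for illustration; a real host would tune or replace all of them.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    captcha_ok: bool           # did the challenge verification pass?
    account_age_days: float
    requests_last_minute: int
    ip_reputation: float       # 0.0 (bad) to 1.0 (good), hypothetical scale


def decide(s):
    """Return 'allow', 'review', or 'deny' from a simple weighted score."""
    score = 0.0
    score += 0.4 if s.captcha_ok else 0.0
    score += 0.2 if s.account_age_days >= 30 else 0.0
    score += 0.2 if s.requests_last_minute <= 20 else 0.0
    score += 0.2 * s.ip_reputation
    if score >= 0.7:
        return "allow"
    if score >= 0.4:
        return "review"
    return "deny"


trusted = Signals(True, account_age_days=400, requests_last_minute=3, ip_reputation=0.9)
solved_but_noisy = Signals(True, account_age_days=0, requests_last_minute=500, ip_reputation=0.5)
unsolved_and_bad = Signals(False, account_age_days=0, requests_last_minute=500, ip_reputation=0.1)

print(decide(trusted), decide(solved_but_noisy), decide(unsolved_and_bad))
```

In this sketch a correct solve from a brand-new, high-velocity session lands in "review" rather than "allow", which matches the idea that the solve is evidence, not identity.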

Freshness<br>Each challenge expires and is accepted only once by the server, so a stale or replayed solve is worthless as a reusable credential.

Page-state awareness<br>The intended solver must inspect rendered frames and the manifest produced by this widget instance.

Compute evidence<br>Computing the checksum requires spectral fusion across the frames, and the submit body carries a proof-of-work nonce.

Debuggable contract<br>The hidden instructions and JSON manifest make failures explainable for compliant agents and maintainers.
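Server-side checking of a submission against these properties could be sketched as follows. The page does not describe the proof-of-work scheme, so the hashcash-style rule here (SHA-256 of "id:nonce" with a number of leading zero bits) is an assumption, as are the 12-bit difficulty, the store layout, and the result strings.

```python
import hashlib
import hmac

POW_BITS = 12  # assumed difficulty: leading zero bits required in the hash


def pow_ok(challenge_id, nonce, bits=POW_BITS):
    """True if sha256("<id>:<nonce>") starts with `bits` zero bits."""
    digest = hashlib.sha256(f"{challenge_id}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def find_nonce(challenge_id, bits=POW_BITS):
    """Brute-force search a solver would run; expected cost is about 2**bits."""
    nonce = 0
    while not pow_ok(challenge_id, nonce, bits):
        nonce += 1
    return nonce


def verify(store, challenge_id, checksum, nonce, now):
    """Check a submission for single use, freshness, work, and the answer."""
    record = store.pop(challenge_id, None)  # pop makes every id single-use
    if record is None:
        return "unknown-or-replayed"
    if now > record["expires_at"]:
        return "expired"
    if not pow_ok(challenge_id, nonce):
        return "bad-proof-of-work"
    # Constant-time comparison so timing does not leak the checksum.
    if not hmac.compare_digest(checksum, record["checksum"]):
        return "wrong-checksum"
    return "ok"


store = {"c1": {"checksum": "abc123", "expires_at": 100.0}}
nonce = find_nonce("c1")
first = verify(store, "c1", "abc123", nonce, now=95.0)
second = verify(store, "c1", "abc123", nonce, now=96.0)
print(first, second)
```

Because the record is removed on the first attempt, the second submission fails even with a correct checksum and nonce. Returning a distinct reason for each failure is one way to keep the contract debuggable for compliant agents.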

The challenge tricks

It looks chaotic on screen, but the real puzzle is in the frequency<br>domain. Every frame carries the genuine signal, some per-frame<br>decoys, and a layer of noise that is just for show. You have to fuse<br>the frames before you trust whichever peak looks strongest.
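A toy version of that idea, reduced to one dimension, is sketched below. The demo's real fusion method is not described on this page, so averaging magnitude spectra across frames is an assumption, and the frame length, bin numbers, and amplitudes are invented. Each synthetic frame carries a shared tone, a louder decoy at a frame-specific frequency, and a little seeded noise.

```python
import cmath
import math
import random

N = 64                         # samples per frame (1-D stand-in for pixels)
TRUE_BIN = 5                   # frequency of the signal shared by all frames
DECOY_BINS = [11, 17, 23, 29]  # one louder decoy per frame


def make_frame(decoy_bin, rng):
    """Shared tone + a louder per-frame decoy + low-level noise."""
    return [
        1.0 * math.sin(2 * math.pi * TRUE_BIN * n / N)
        + 1.5 * math.sin(2 * math.pi * decoy_bin * n / N)
        + rng.uniform(-0.1, 0.1)
        for n in range(N)
    ]


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..len/2-1 (fine at this size)."""
    size = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * n / size)
                for n, x in enumerate(signal)))
        for k in range(size // 2)
    ]


def peak_bin(spectrum):
    """Index of the strongest bin."""
    return max(range(len(spectrum)), key=spectrum.__getitem__)


rng = random.Random(0)
frames = [make_frame(b, rng) for b in DECOY_BINS]
spectra = [magnitude_spectrum(f) for f in frames]

# Fuse by averaging magnitudes bin by bin across all frames.
fused = [sum(column) / len(column) for column in zip(*spectra)]

per_frame_peaks = [peak_bin(s) for s in spectra]
fused_peak = peak_bin(fused)
print(per_frame_peaks, fused_peak)
```

In any single frame the strongest peak sits on that frame's decoy bin. After averaging, each decoy is diluted by the frames that lack it while the shared tone keeps its full strength, so the fused peak lands on bin 5.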

Fused frames<br>Every image contributes...
