WebGPU Cluster — Distributed WebGPU inference
Distributed GPU network
AI on the grid, powered by your WebGPU.
Turn any browser with WebGPU into a cluster node and share LLM inference. Host a model on your powerful workstation and access it securely from your phone or laptop, or let others connect to it.
Join the grid · View available grid
Powered by WebGPU & Transformers.js · No GPU drivers to install · Open HTTP API
Why join
Browser-native GPU sharing
Contribute spare GPU cycles from your workstation. Clients send images over HTTP; your browser runs the model and returns results privately, on your hardware.
🖥️
Instant local hosting
Open a tab, pick a model, and start hosting. RF-DETR and SmolVLM load in a Web Worker on WebGPU, with no Python environment or driver setup.
🔒
Your data stays local
Inference runs on your GPU in the browser. Images are processed on your machine; nothing is sent to third-party AI APIs.
🔌
Universal HTTP API
Connect from curl, Python, Node, or any HTTP client. Simple JSON endpoints for detection and image description, with queue and broker included.
Architecture
How the grid works
A lightweight Node broker coordinates tasks. Browser hosts stay connected via SSE and pull jobs when idle.
Client (curl / Python / app) → Broker (task queue · /v1/detect) → Browser host (WebGPU inference) → Response (boxes · labels · text)
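The coordination described above can be sketched as a small in-memory queue. This is an illustration only, not the broker's actual code: the real broker runs on Node, and the class names, method names, and job shape below are hypothetical. It shows the one behavior the page states, that a host receives a new job only when it is idle.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HostState:
    """Per-host bookkeeping: pending jobs plus the job currently running."""
    pending: deque = field(default_factory=deque)
    running: Optional[dict] = None


class Broker:
    """Toy in-memory broker. All names here are hypothetical."""

    def __init__(self) -> None:
        self.hosts = {}  # host id -> HostState

    def register(self, host_id: str) -> None:
        """Create queue state for a host the first time it connects."""
        self.hosts.setdefault(host_id, HostState())

    def submit(self, host_id: str, job: dict) -> None:
        """Queue a client job for a registered host."""
        self.hosts[host_id].pending.append(job)

    def next_job(self, host_id: str) -> Optional[dict]:
        """Hand out the oldest pending job, but only if the host is idle."""
        state = self.hosts[host_id]
        if state.running is not None or not state.pending:
            return None
        state.running = state.pending.popleft()
        return state.running

    def complete(self, host_id: str) -> None:
        """Mark the host idle so it can receive its next job."""
        self.hosts[host_id].running = None
```

A second call to `next_job` returns `None` until `complete` is called, which is how this sketch models a single job in flight per host.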
Register your node
Open the host page, choose a host id and model, then click Start hosting. Keep the tab open while you share GPU time.
Jobs arrive via SSE
The broker forwards detection and description tasks to your browser. One job runs at a time per host.
Anyone can call the API
Point clients at POST /v1/detect or POST /v1/describe with your host id. Results return as JSON.
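For readers unfamiliar with SSE, the sketch below shows how a stream of server-sent events could be decoded into job objects. The framing rules (`data:` lines, a blank line ending each event, `:` comment lines ignored) follow the standard SSE wire format. The assumption that each event carries one JSON job, and the field names in the usage example, are illustrative and not taken from the broker. Real hosts receive events in the browser rather than in Python.

```python
import json


def parse_sse(stream_text: str) -> list:
    """Split an SSE stream into events and decode each JSON data payload."""
    events = []
    data_lines = []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            # The SSE format drops one optional space after the colon.
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data joins with "\n".
            events.append(json.loads("\n".join(data_lines)))
            data_lines = []
        # Comment lines (":...") and other fields are ignored in this sketch.
    return events
```

For example, `parse_sse(': keepalive\n\ndata: {"task": "detect"}\n\n')` skips the keepalive comment and yields a single job dictionary.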
API
Call the grid from anywhere
Use the cluster monitor to see online hosts and copy ready-made curl examples.
detect.sh · POST /v1/detect
curl -X POST 'http://localhost:5180/v1/detect' \
  -H 'Content-Type: application/json' \
  -d '{
    "host": "my-gpu-node",
    "image_url": "https://example.com/photo.jpg",
    "threshold": 0.5
  }'
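The same request can be issued from Python using only the standard library. The sketch below mirrors the curl example's URL, headers, and JSON fields. The function names are hypothetical, and because the page does not specify the response schema beyond boxes, labels, and text, the parsed JSON is returned as-is.

```python
import json
import urllib.request

BROKER_URL = "http://localhost:5180"  # same local broker address as the curl example


def build_detect_request(host: str, image_url: str, threshold: float = 0.5) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/detect request."""
    body = json.dumps(
        {"host": host, "image_url": image_url, "threshold": threshold}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BROKER_URL}/v1/detect",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(host: str, image_url: str, threshold: float = 0.5, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_detect_request(host, image_url, threshold)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `detect("my-gpu-node", "https://example.com/photo.jpg")` requires a running broker and an online host with that id. The timeout value is an arbitrary choice for this sketch.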
Models
What you can host today
Models download from Hugging Face on first load. Pick one per host session.
Detection
RF-DETR Medium
Real-time object detection (COCO) via ONNX on WebGPU. Endpoint: POST /v1/detect
Vision-language
SmolVLM-500M
Describe images with a compact VLM on WebGPU. Endpoint: POST /v1/describe
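Because each model answers on its own endpoint, a client needs to pick the path that matches the model a host is running. The sketch below maps the two models listed above to their endpoints and builds a request body. The helper name is hypothetical, and the `/v1/describe` body is assumed to take the same `host` and `image_url` fields as the detect example, which the page does not confirm.

```python
import json

# Model name -> endpoint, as listed in the Models section.
MODEL_ENDPOINTS = {
    "RF-DETR Medium": "/v1/detect",
    "SmolVLM-500M": "/v1/describe",
}


def request_for(model: str, host: str, image_url: str,
                base_url: str = "http://localhost:5180") -> tuple:
    """Return (url, json_body) for the endpoint that serves the given model."""
    path = MODEL_ENDPOINTS[model]  # raises KeyError for an unlisted model
    # Assumed body shape: same fields as the detect example, minus threshold.
    body = json.dumps({"host": host, "image_url": image_url})
    return base_url + path, body
```

The returned URL and body can be passed to any HTTP client, such as curl or `urllib.request`.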
Ready to power the grid?
Share your GPU or explore nodes already online.