Imece – Distributed AI inference using volunteer GPUs and FLOP token


GitHub - aslankose/imece: A decentralized AI compute cooperative where contributors earn inference credits by donating idle GPU/CPU time — measured in FLOPs, not crypto. · GitHub


Repository contents: coordination, docs, imece-app, inference, scripts, Dockerfile, README.md, compose.yml, imece_client.py
imece

A decentralized AI compute cooperative where contributors earn inference credits by donating idle GPU/CPU time — measured in FLOPs, not crypto.

imece is an open-source framework that allows anyone to contribute idle compute resources in exchange for AI inference credits — denominated in floating-point operations (FLOPs), not cryptocurrency.

The core idea: You donate idle GPU/CPU time → you earn GigaFLOP-Tokens (GFT) → you spend GFT to access AI inference. No speculation. No financial value. Just compute for compute.

Motivation

AI inference is increasingly powerful but increasingly centralized. Access is gated by capital, not contribution. Meanwhile, millions of GPUs sit idle every night across the world, in different time zones, on different grids.

imece turns that idle capacity into a global cooperative — one where the communities that bear the cost of AI infrastructure are also empowered to benefit from it.

A secondary benefit: because contributor nodes are globally distributed across time zones, computation naturally migrates toward regions with low electricity demand and high renewable availability at any given hour — a passive energy-efficiency property that centralized data centers cannot replicate.

How It Works

Contribute idle compute → earn GFT tokens → spend tokens on AI inference

Token formula:

T_earned = FLOPs_delivered × Hardware_Multiplier × Reliability_Factor

Token cost per inference:

T_spent = FLOPs_per_model × Output_tokens × Precision_factor
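As a rough illustration, the two formulas can be written as plain functions. This is a minimal sketch, not the project's implementation: the function names, the conversion of 1 GFT to 10^9 FLOPs, the assumption that the reliability factor lies between 0 and 1, and the reading of `FLOPs_per_model` as FLOPs per generated output token are all assumptions made for the example, as are the sample numbers.

```python
GIGA = 1e9  # assumption: 1 GFT corresponds to 10^9 floating-point operations


def tokens_earned(flops_delivered: float,
                  hardware_multiplier: float,
                  reliability_factor: float) -> float:
    """T_earned = FLOPs_delivered x Hardware_Multiplier x Reliability_Factor, in GFT."""
    if flops_delivered < 0 or hardware_multiplier < 0:
        raise ValueError("FLOPs and multiplier must be non-negative")
    if not 0.0 <= reliability_factor <= 1.0:  # assumed range, not stated in the README
        raise ValueError("reliability_factor is assumed to lie in [0, 1]")
    return flops_delivered / GIGA * hardware_multiplier * reliability_factor


def tokens_spent(flops_per_output_token: float,
                 output_tokens: int,
                 precision_factor: float) -> float:
    """T_spent = FLOPs_per_model x Output_tokens x Precision_factor, in GFT.

    FLOPs_per_model is interpreted here as FLOPs per generated token (assumption).
    """
    if flops_per_output_token < 0 or output_tokens < 0 or precision_factor < 0:
        raise ValueError("inputs must be non-negative")
    return flops_per_output_token / GIGA * output_tokens * precision_factor


# Hypothetical numbers, chosen only to exercise the formulas.
earned = tokens_earned(5e12, hardware_multiplier=1.0, reliability_factor=0.9)
spent = tokens_spent(1.6e10, output_tokens=200, precision_factor=0.5)
print(f"earned {earned:.1f} GFT, spent {spent:.1f} GFT")
```

Both sides are expressed in the same unit, so a contributor can compare what a session earned against what a request would cost.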

All tokens are:

- Denominated in GigaFLOPs (objective, hardware-agnostic)
- Non-transferable and non-tradeable by design
- Tied to the wallet that earned them

Architecture

The framework has four components:

| Component | Role |
| --- | --- |
| Contributor Client | Benchmarks device, serves transformer model layers, manages token wallet |
| Coordination Layer | Dispatches tasks, assigns hardware multipliers, issues tokens, routes inference |
| Token Ledger | Append-only hash-chained log of all GFT issuance and redemption |
| Inference Cluster | Custom distributed pipeline (volunteer nodes) primary, centralized fallback |
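To make "append-only hash-chained log" concrete, the sketch below shows one common way to build such a ledger: each entry stores the SHA-256 hash of the previous entry, so editing any past record invalidates every later hash. The class name `Ledger`, the record fields, and the choice of SHA-256 over a JSON payload are assumptions for illustration; the README does not specify the actual ledger format.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this entry's record."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class Ledger:
    """Hypothetical in-memory ledger; entries are only ever appended."""
    entries: list = field(default_factory=list)

    def append(self, wallet: str, kind: str, amount_gft: float) -> dict:
        if kind not in ("issue", "redeem"):
            raise ValueError("kind must be 'issue' or 'redeem'")
        prev = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        record = {"wallet": wallet, "kind": kind, "amount_gft": amount_gft}
        entry = {"prev": prev, "record": record, "hash": _entry_hash(prev, record)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash in order; any edited record breaks the chain."""
        prev = GENESIS_HASH
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != _entry_hash(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True

    def balance(self, wallet: str) -> float:
        """Issued minus redeemed GFT for one wallet."""
        total = 0.0
        for entry in self.entries:
            rec = entry["record"]
            if rec["wallet"] == wallet:
                total += rec["amount_gft"] if rec["kind"] == "issue" else -rec["amount_gft"]
        return total


ledger = Ledger()
ledger.append("wallet-a", "issue", 100.0)
ledger.append("wallet-a", "redeem", 30.0)
print(ledger.verify(), ledger.balance("wallet-a"))
```

There is deliberately no transfer operation in this sketch, which mirrors the non-transferable design: a wallet's balance can only change through issuance or redemption.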

Distributed Inference Architecture

imece implements a custom layer-sharding system for distributed inference:

Primary: Volunteer contributor nodes each serve a contiguous slice of transformer layers. Inference requests flow through the pipeline — activations pass from node to node until the final output is generated. Contributors earn tokens proportional to FLOPs delivered. Any HuggingFace-compatible transformer model can be served — LLaMA 3, Mistral, Mixtral, and others.

Fallback: A centralized inference service handles requests when the volunteer pipeline is unavailable, so access does not depend on volunteer uptime. The current implementation includes a Groq fallback path; additional providers are welcome as community contributions.
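The primary and fallback paths can be sketched in a few lines. The example below is a toy model under stated assumptions: layers are plain Python callables standing in for transformer layers, node availability is a list of booleans, and the names `partition_layers`, `run_pipeline`, `infer`, and `NodeUnavailable` are hypothetical. It does not reflect imece's actual networking or scheduling code.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[float], float]  # stand-in for a transformer layer


class NodeUnavailable(Exception):
    """Raised when a node in the volunteer pipeline is offline."""


def partition_layers(num_layers: int, num_nodes: int) -> List[range]:
    """Split layer indices into contiguous slices, one per node, as evenly as possible."""
    if num_nodes <= 0 or num_layers < num_nodes:
        raise ValueError("need at least one layer per node")
    base, extra = divmod(num_layers, num_nodes)
    slices, start = [], 0
    for node in range(num_nodes):
        size = base + (1 if node < extra else 0)  # first `extra` nodes take one more layer
        slices.append(range(start, start + size))
        start += size
    return slices


def run_pipeline(layers: Sequence[Layer], slices: Sequence[range],
                 online: Sequence[bool], x: float) -> float:
    """Pass the activation from node to node; each node applies its own slice of layers."""
    for node, layer_slice in enumerate(slices):
        if not online[node]:
            raise NodeUnavailable(f"node {node} is offline")
        for i in layer_slice:
            x = layers[i](x)
    return x


def infer(layers: Sequence[Layer], slices: Sequence[range], online: Sequence[bool],
          x: float, fallback: Callable[[float], float]) -> Tuple[float, str]:
    """Try the volunteer pipeline first; use the centralized fallback if it is unavailable."""
    try:
        return run_pipeline(layers, slices, online, x), "volunteer"
    except NodeUnavailable:
        return fallback(x), "fallback"


layers = [lambda v: v + 1.0] * 8        # eight toy layers
slices = partition_layers(8, 3)         # three volunteer nodes
print(slices)
print(infer(layers, slices, [True, True, True], 0.0, fallback=lambda v: -1.0))
print(infer(layers, slices, [True, False, True], 0.0, fallback=lambda v: -1.0))
```

Contiguous slices keep each activation hand-off to a single hop between neighbouring nodes, which is why the sketch partitions by index range rather than assigning layers round-robin.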

This makes the token economy architecturally honest — earned tokens are backed by compute that directly contributes to real AI inference.

Hardware Multiplier Tiers

| Hardware Class | Example Devices | Multiplier |
| --- | --- | --- |
| Mobile / Edge | Smartphone SoCs, Raspberry Pi | 0.05× |
| CPU Only | Desktop / server CPUs | 0.10× |
| Entry Consumer GPU (integrated) | Intel UHD, AMD Radeon integrated | 0.50× |
| Mid Consumer GPU (baseline) | RTX 3060, RX 6700 XT | 1.00× |
| High Consumer GPU | RTX 4080, RX 7900 XTX | 2.00× |
| Prosumer GPU | RTX 4090, RTX 6000 Ada | 3.00× |
| Professional Accelerator | A40, L40S | 5.00× |
| Data Center Accelerator | A100,... | |
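One simple way to use these tiers in code is a lookup table keyed by hardware class. This is a sketch only: the dictionary and the `multiplier_for` helper are hypothetical names, and the Data Center Accelerator tier is left out because its multiplier is not shown in the text above.

```python
# Multipliers copied from the tier table; the Data Center tier is omitted
# because its value is not given in the visible text.
HARDWARE_MULTIPLIERS = {
    "Mobile / Edge": 0.05,
    "CPU Only": 0.10,
    "Entry Consumer GPU (integrated)": 0.50,
    "Mid Consumer GPU (baseline)": 1.00,
    "High Consumer GPU": 2.00,
    "Prosumer GPU": 3.00,
    "Professional Accelerator": 5.00,
}


def multiplier_for(hardware_class: str) -> float:
    """Return the multiplier for a hardware class, rejecting unknown classes."""
    try:
        return HARDWARE_MULTIPLIERS[hardware_class]
    except KeyError:
        raise ValueError(f"unknown hardware class: {hardware_class!r}") from None


# Print each known tier with its multiplier, lowest first.
for name, mult in sorted(HARDWARE_MULTIPLIERS.items(), key=lambda kv: kv[1]):
    print(f"{name:<34} {mult:.2f}x")
```

Rejecting unknown classes, rather than defaulting to 1.00, avoids silently crediting an unrecognized device at the baseline rate.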
