RickCarlino.com
Senior software engineer blogging about software systems, computing history, and practical engineering.
Gnutella Explanation
A Protocol Outlives the World That Created It
This blog serves as my unreasonable and overly enthusiastic love letter to Gnutella, the greatest peer-to-peer project of all time.
Gnutella is the story of a decentralized technology adopted by millions of casual users who did not care to learn what a peer-to-peer system was. Users showed up because the protocol solved real problems at scale, and the solution just so happened to be decentralized.
No one ever pretended to use Gnutella in hopes their GnutellaCoin™ would go up in value later. They just downloaded MP3s.
Despite its meteoric rise and its role as a driving force behind the file-sharing phenomenon of the 2000s, Gnutella has gone mostly forgotten. Part of the reason is that it was a component technology hidden beneath more visible projects like LimeWire. The other part is that the walled-garden model of modern platforms means most internet users don't even remember what a filesystem is anymore.
The Gnutella project began as an internal demo that leaked to the public after its corporate overlord, AOL, cancelled the project. Owing to its server-free decentralized design, it was impossible to put the toothpaste back in the figurative tube after it reached the public. It grew explosively for a decade and still works today despite years of attempts to stop it. Copies of the original Gnutella.exe are out there on archive.org if you dig for them.
Many have wrongly asserted that Gnutella failed, but that's not a fair representation of what happened. Gnutella scaled to mainstream adoption (millions of concurrent active users) and thrived for a solid decade. The true reason for its fall from the mainstream was simply that the world it was born into disappeared.
Gnutella stood the test of time and solved problems for a software user that no longer exists. It's still there today, chugging along at reduced capacity.
Normy-fication of Internet Usage in a New Millennium
The early 2000s represented a strange transition period for US consumers. Internet adoption hit 50% sometime around 2000-2001. The internet was slowly mutating from a complicated tool for nerds into a mainstream part of daily life. Music file sharing became a common practice during this time for a number of reasons:
The music industry refused to adapt to changing consumer preference.
MP3 players and solid-state data storage became affordable and ubiquitous.
Low-speed dial-up internet made music streaming unfeasible.
Managing disk space, directories, backups, and downloaded files was still a palatable and acceptable activity for even casual computer users of the era.
These conditions set the stage for a golden era that lasted into the early 2010s. If you do not believe me, ask anyone over the age of 35 about their LimeWire memories. I was there, man. It was wild.
Gnutella's lack of a single point of failure makes it difficult to kill, and the base protocol, though simple, was easy to extend via optional protocol extensions baked into the spec.
What the Protocol Actually Does
For most users, Gnutella was a file transfer tool. That categorization misses a basic function of the protocol. At its core, Gnutella is just a peer-to-peer search engine for blobs.
We could have used it as a poor man's DNS system, or a global metadata lookup table for key/value pairs, or a matchmaking service for your Unreal Tournament league, but that never really happened. Gnutella was good at providing file downloads that matched search queries, and that is what history remembers it for. Loads and loads of easy downloads. Usually MP3s.
Resources (things shared on Gnutella) can be anything: mappings to other resources, cryptographic keys, files of any type, meta-information on keyable resources, etc.
--- Gnutella 0.6 draft spec
The process worked as follows:
You opened a desktop application that spoke Gnutella, such as LimeWire, BearShare, or GTK-Gnutella.
The client connected to a handful of peers somewhere on the internet (I will explain how you found them later).
You typed something into a search box, like LinkinPark.mp3.exe.
Your query spread outward through the network from peer to peer.
Results slowly trickled back from random computers around the world.
You inspected filenames, guessed which results were fake, compared connection speeds, and hoped none of them were viruses.
Once you picked a file, your client downloaded pieces of it directly from another user's computer over HTTP.
Sometimes you downloaded the wrong thing and accidentally discovered new content. Or malware, you never really knew for sure. This foraging behavior has disappeared with the advent of recommendation engines and we lost a tiny bit of internet magic in the process.
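If you want to see the "query spreads outward, results trickle back" part of the process above in miniature, here is a rough Python sketch. It is a toy simulation under stated assumptions, not how any real client was written: the Peer class, the connect and flood_query helpers, and the TTL value are all names and numbers I made up for illustration. The two ideas it borrows from the protocol are a hop limit (TTL) so queries eventually die out, and a per-query ID so a peer ignores a query it has already seen. Real Gnutella routes hits back along the reverse path of the query; this sketch skips that and just collects them in a list.

```python
from collections import deque
from dataclasses import dataclass, field
import uuid


@dataclass
class Peer:
    """Hypothetical in-memory peer; real peers talk over TCP sockets."""
    name: str
    files: list[str]
    neighbors: list["Peer"] = field(default_factory=list)
    seen: set[str] = field(default_factory=set)  # query IDs already handled


def connect(a: Peer, b: Peer) -> None:
    """Create a two-way link between two peers."""
    a.neighbors.append(b)
    b.neighbors.append(a)


def flood_query(origin: Peer, keyword: str, ttl: int = 3) -> list[tuple[str, str]]:
    """Flood one query outward from origin and return (peer name, filename) hits.

    ttl is the maximum number of hops the query may travel. The value 3 is
    illustrative only.
    """
    query_id = uuid.uuid4().hex      # lets peers recognize duplicate copies
    origin.seen.add(query_id)        # the origin never answers its own query
    hits: list[tuple[str, str]] = []
    frontier = deque((n, ttl) for n in origin.neighbors)

    while frontier:
        peer, remaining = frontier.popleft()
        if remaining <= 0 or query_id in peer.seen:
            continue                 # expired, or this peer already saw it
        peer.seen.add(query_id)
        # Naive substring match on filenames, case-insensitive.
        hits.extend(
            (peer.name, f) for f in peer.files if keyword.lower() in f.lower()
        )
        # Forward to every neighbor with one less hop to live.
        frontier.extend((n, remaining - 1) for n in peer.neighbors)

    return hits


if __name__ == "__main__":
    me = Peer("me", [])
    alice = Peer("alice", ["LinkinPark.mp3"])
    bob = Peer("bob", ["LinkinPark.mp3.exe"])
    connect(me, alice)
    connect(alice, bob)
    for peer_name, filename in flood_query(me, "linkin"):
        print(peer_name, filename)
```

Raising the TTL reaches more of the network at the cost of more traffic, which is the basic trade-off any flooding search has to live with.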
Clients usually offer 3-4 main features, mirroring the 5 main message types of the underlying protocol:
A query...