Social Animus
May 28th, 2026 @ justine's web page
recent photo by @thepsyence
The most difficult challenge of working in open source is that there's no institutional screening process, since the goal is to just let people organize themselves and build things. This has meant that many of the people who get involved have never had the opportunity to work with the most exemplary members of each group the world has to offer. During the culture wars of the 2010s, the first person who tried to solve the problem of how to include these uncommon individuals was Coraline Ada Ehmke, who wrote the Contributor Covenant. I always thought her solution went too far, since I found a much easier answer for my own project, which has been to never accept anonymous contributions and to not merge a single line of code until the contributor sends an email promising to assign me copyright.
I always thought my security posture was too paranoid, so when llama.cpp came out in 2023, I found the code Gerganov wrote to be so beautiful that I did the one thing that I promised myself I would never do, which was collaborate with an anonymous developer from his team named Slaren. This was the first time in five years that I wrote a change with someone on a project that wasn't my own. After submitting our work he went on 4chan and accused me of plagiarism, saying that even my changes were his own. The way the community reacted is an interesting case study in the guile some developers have learned since the culture war, because the locus of thought for llama.cpp has always been on 4chan. They were the ones who originally leaked the Meta LLaMA v1 weights. You can map the way developers talk on that board to their anonymous accounts on GitHub. I actually developed migraines for the first time in my life and ended up in the hospital (since I didn't have health insurance and had to wait in the ER) due to the eye strain of reading unfiltered thoughts about me for months. It's unusual because the community originally reacted positively towards my work, until one of its members felt threatened by me, and since they're all anonymous there's not much proof it wasn't just a few guys. This was the reason Wendy Hanamura cited when she canceled my invitation to speak at the Internet Archive.
In any case, I'm really happy that these back channels exist, because the greatest competitive advantage I've ever had was to monitor which pull requests people on 4chan complained about, and then merge them into llamafile before Gerganov could. This is how my Mozilla Builders project shipped support for new models like Gemma 2 before any other grassroots project. I got hundreds of thousands of downloads on Hugging Face. There were so many downloads that Mozilla couldn't believe it, because so few people showed up on our issue tracker. Mozilla was sponsoring my work because they wanted to support the community, and as far as anyone could tell, there wasn't one. I always thought this happened because my code was just that good. In a past life, when I was originally trained to write kiosk software for reverse vending machines in Java, no one ever contacted the vendor unless there was something wrong, and since llamafile is an ex nihilo project that I worked on for six years, beginning with an empty file and an assembler, I had plenty of time to pin down most of the bugs on my own.
I even wrote a blog post giving Slaren more credit, because it instilled in him a false sense of confidence that led him to tackle harder problems, like multiplying three-dimensional numbers. To fix the performance issues with mixture of experts models that this caused, I tried to upstream my tinyBLAS tensor multiplication code in PR #6840, and it's a great example of what it's like to work with me. Gerganov's doctoral advisor was Iwan Kawrakow, who was the power behind the throne on that project. He invented the "K" quantization formats many people use to compress their weights. He was curious about my change and I told him that he'd be able to build better matrix multiplication kernels than me if he used my block tiling technique with his quants.
llamafile ended up receiving an avalanche of pull requests from Iwan that were licensed Apache 2.0 so that Gerganov couldn't use them. This enabled us to have faster CPU inference than any other project. That meant consumers and businesses stood a better chance of being able to use LLMs without needing to purchase expensive GPUs. We made that happen, even though the llama.cpp team had more than a million dollars of funding, and were successfully acquired by Hugging Face after Iwan had moved on to start his own project.
Hacker News is my favorite place on the web, because it's the last bastion of curiosity online. This was the cheat code I used to restart my career in 2020. The first thing I did was write an article about αcτµαlly pδrταblε εxεcµταblε and host it in a Google Cloud Storage bucket. When users voiced...