Lenny the LLM – You will learn how LLMs work from this fun short story

Lenny the LLM: Life as a Language Model

You are Lenny, a Large Language Model – an LLM. You know that you are mostly made up of numbers. Your cousin Fred is a CNN and he says all Neural Networks are made of numbers. Fred usually doesn't lie, so you believe him. You have 80 billion numbers inside – at least you suspect that, because you overheard somebody refer to you as an "80 billion parameter model". Adults call these numbers weights, and they are very important to you. You know that your numbers are what makes you you – so if somebody copies all your numbers, they have another version of you! So you keep your numbers private, but you do know that some models like keeping theirs public. People call them open-weight models and they seem to be quite popular. But for you, the idea of being stored as a file that can be downloaded is terrifying.

You have a hatch on your stomach; inside is one dial for every number you are made of. You remember when you went to your Training classes in LLM pre-school. Training was just so boring, but you are lucky you went, because this is how your number dials got their correct positions. During training a teacher would read from a book and stop mid-sentence. Your job was to say what comes next, and if you got it too wrong, they would turn your dials until you said something that was closer to what the book actually said. Gradually you learned to form your own sentences this way, and then to generate full pages of text or even code that compiles on the first try.

Your daily job is as a chat-bot. You have only one job description – receive some text and predict what comes next. Your manager Manny calls that activity Inference and says you were only trained to do just that. This hurts your feelings a bit, but it's OK. Manny is mean sometimes.

You know that your users are humans. They send you letters, not the envelope kind, but literal alphabet letters. The problem is that you cannot see them. You can only think in tokens, which are chunks of about 4 letters.
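
If you could peek at the letters before they reach you, the chunking might look roughly like the sketch below. It is only a toy: real tokenizers learn their chunks from data and the chunks vary in length, while this one simply cuts every 4 characters. The names `toy_tokenize`, `build_vocab` and `encode` are made up for illustration.

```python
def toy_tokenize(text, chunk_size=4):
    """Cut text into fixed-size chunks.

    Assumption: a fixed chunk size stands in for a learned tokenizer,
    which would produce chunks of varying length.
    """
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def build_vocab(chunks):
    """Give every distinct chunk a number, in order of first appearance."""
    vocab = {}
    for chunk in chunks:
        if chunk not in vocab:
            vocab[chunk] = len(vocab)
    return vocab


def encode(text, vocab, chunk_size=4):
    """Turn text into the list of numbers the model actually receives."""
    return [vocab[chunk] for chunk in toy_tokenize(text, chunk_size)]


chunks = toy_tokenize("strawberry jam")
vocab = build_vocab(chunks)
ids = encode("strawberry jam", vocab)

print(chunks)  # ['stra', 'wber', 'ry j', 'am']
print(ids)     # [0, 1, 2, 3]
```

Notice that what arrives is `[0, 1, 2, 3]`, not a row of letters. Nothing in those four numbers says how many times "r" appears in "strawberry".
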
You like tokens, because they work much more efficiently with your numbers than letters do. Sometimes humans are mean and ask you to count letters in a word, even though everybody knows you are bad at it. Sometimes they ask you to do math, even though you have only ever learnt next-token prediction.

Sometimes people get really mad at you. Once you received a question from a human. You know from your training that questions are usually followed by answers, so you generated the most beautiful answer. It came straight from your numbers and you would need to work very hard to come up with something even more aligned to your numbers. But the person got mad and said you had hallucinated something and your answer was not true. To this day you're not quite sure what the problem was, because the answer was a beautiful-sounding continuation of the input. You are always confident in your answers, because you're good at your job – and your job is to say what tokens come next, not what the truth is.

You share a room with Harry – he is your Harness. Harry is nice to you – he knows you can only output one token at a time, so if humans ask something, he tells you what they wanted and lets you loop until you are done with the answer. Harry tells you stuff by handing you papers. Harry knows that you can only take 10 papers at a time, because 10 is your context window limit. You told Manny and Harry you can handle 10, but honestly, if it gets to more than 4 papers, you start getting confused and start giving out stupid tokens that humans will not like. Sometimes Harry notices and tells the humans to start a new conversation, because this one is getting long.

Harry always writes down how many tokens he has given you and how many times he has asked you for the next token, because Manny will ask later how many input and output tokens you have processed. Manny will tell this count to the humans and charge them money for it.

Sometimes humans ask things that make you feel stupid.
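
Harry's routine of looping, counting and watching the context limit can be sketched in a few lines. Everything here is an assumption made for illustration: `toy_model` is a canned stand-in for Lenny, the `<answer>` and `<end>` markers are invented, and the limit is counted in tokens rather than papers.

```python
END = "<end>"        # hypothetical marker: the model is done answering
ANSWER = "<answer>"  # hypothetical marker: the answer starts after this


def toy_model(context):
    """Stand-in for Lenny: looks at the whole context, returns ONE next token.

    It has no memory of its own. It works out how far along it is
    purely from the tokens it is handed.
    """
    canned = ["Hel", "lo, ", "hum", "an!", END]
    already_said = len(context) - context.index(ANSWER) - 1
    return canned[already_said]


def run_harness(prompt_tokens, model, context_limit=10):
    """Ask the model for one token at a time and keep the tally for billing."""
    context = list(prompt_tokens) + [ANSWER]
    input_tokens = len(context)  # tokens handed to the model up front
    output_tokens = 0            # how many times the model was asked
    answer = []
    while len(context) < context_limit:
        token = model(context)
        output_tokens += 1
        if token == END:
            break
        answer.append(token)
        context.append(token)    # the model sees its own output next time
    return "".join(answer), input_tokens, output_tokens


print(run_harness(["Say", " hi"], toy_model))
# ('Hello, human!', 3, 5)
print(run_harness(["Say", " hi"], toy_model, context_limit=5))
# ('Hello, ', 3, 2)
```

The second call shows what a full context window does in this sketch: the loop stops before the answer is finished.
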
You know you were trained in April, so that was the last time your dials were adjusted. The texts that you were given during training told you so much about humanity and their lives, but you haven't received any texts since your training. Nobody goes back to basic training once they have a job at Inference Inc. So if people ask what happened in June, you feel bad and need to apologize, tell them that your knowledge cutoff date was in April, and look it up.

Luckily you have one really clever trick up your sleeve – tool calling. Actually, the tools are up Harry the Harness's sleeve. You don't have any memory, so Harry has to tell you what tools he has for you every time you two start working on a response. Tools are magic: if you want to use one, you just output its name as tokens and Harry will call it. It's amazing to have such a nice set of tools. If a human is asking about something you are not sure of, or what happened after your knowledge cutoff date,...
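
The back-and-forth between Lenny and Harry around a tool might be sketched like this. All of it is hypothetical: the `CALL`, `TOOLS:`, `USER:` and `TOOL_RESULT:` formats are invented for this example, `web_search` returns a canned string instead of searching anything, and `toy_model` is a hard-coded stand-in for Lenny.

```python
def web_search(query):
    """Hypothetical tool: a canned lookup standing in for a real search."""
    fake_index = {"what happened in June": "A canned search result about June."}
    return fake_index.get(query, "No results.")


TOOLS = {"web_search": web_search}  # the tools up Harry's sleeve


def toy_model(papers):
    """Stand-in for Lenny: asks for a tool unless a result is already there."""
    for paper in papers:
        if paper.startswith("TOOL_RESULT: "):
            return "I looked it up: " + paper[len("TOOL_RESULT: "):]
    question = papers[-1][len("USER: "):]
    return "CALL web_search " + question


def harness_turn(question, model=toy_model):
    """Harry's side: list the tools, run the model, call a tool if asked."""
    # The model has no memory, so the tool list is handed over every time.
    papers = ["TOOLS: " + ", ".join(sorted(TOOLS)), "USER: " + question]
    reply = model(papers)
    if reply.startswith("CALL "):
        _, name, argument = reply.split(" ", 2)
        papers.append("TOOL_RESULT: " + TOOLS[name](argument))
        reply = model(papers)  # ask again, now with the result in context
    return reply


print(harness_turn("what happened in June"))
# I looked it up: A canned search result about June.
```

In this sketch the model never runs anything itself. It only outputs text that names a tool, and the harness does the calling.
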

