The Technium: Why Are LLMs Smart?
Why Are LLMs Smart?
A popular way to explain how current LLMs work is to say that "all" they do is predict the next most likely word in a sentence. From one perspective, this is correct. Trained on a vast body of human writing, the LLMs distilled billions of word sequences so that they can produce authentic-sounding strings of words that have never been said before. These sentences sound plausible because, having been trained on millions of average human texts, the models are predicting what an average human might say next. They really did succeed at that expected task.
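To make "predict the next most likely word" concrete, here is a toy sketch in Python. It is not how a real LLM is built (those use large neural networks over sub-word tokens, not word counts); it only counts which word most often follows another in a tiny made-up corpus. The function names and the corpus are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(followers, start, length=5):
    """Extend `start` by repeatedly appending the most likely next word."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(followers, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Made-up training text, purely for illustration.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigrams(corpus)

print(predict_next(model, "the"))   # most common word after "the" in this corpus
print(generate(model, "sat", 2))    # extend "sat" by two predicted words
```

A counter like this can only echo word pairs it has already seen, which is why the abilities described in the rest of this essay are surprising.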
Harder to account for are the emergent creative abilities of the LLMs.
The amount of intelligence required to compose one coherent sentence can almost be reduced to the rules in a grade-school grammar book. But the amount of intelligence needed to produce a string of sentences focused on one topic — a paragraph — far exceeds any rules. And the amount of intelligence wrapped up in a string of paragraphs, as in a conversation, begins to approach a pattern we call "thinking." Keep in mind all the work a human needs to do to write a coherent page of text. As researchers scaled up the size and scope of LLMs, they were stunned to find that their systems could begin to imitate the elemental patterns of human thinking found in paragraphs and conversations.
They were shocked because at no point in their invention did they try to program in the elemental process of thinking, or intelligence. They were "merely" extending the patterns of language. The collective surprise of an LLM such as ChatGPT is that by extending the pattern of language, we can arrive at some level of intelligence that is useful beyond language.
If programmers did not program ChatGPT with logical deduction skills, where does the intelligence in its models come from? Why can LLMs behave so intelligently (even if not infallibly), when no one has programmed them to be intelligent? The apparent intelligence of LLMs has been very troubling to experts in the AI field, because no theory of intelligence predicted that large models of language would be able to carry out logical deduction, or that closely related models trained on protein sequences would make headway on the protein-folding problem.
Intelligence locked in language
One explanation is that the elemental intelligence exhibited by LLMs is locked within human writing and in language itself. You can construct a sentence using a grammar rulebook, but to construct a paragraph you need logic, deduction, and reasoning. And further, as any teacher will tell you, to create a coherent essay (a string of paragraphs) you need some kind of clear thinking. The voluminous training material scooped up by the LLM creators is more than just words, more than just sentences, more than just paragraphs. Those trillions of words are embedded in articles, books, essays, rants, replies, comments, tweet-threads, arguments, debates, stories, tales, accounts, reports, blogs. These, and a hundred other long forms, contain intelligence in their arrangement of words. It is the architecture of language that conveys...