The Technium: Why Are LLMs Smart?

Why Are LLMs Smart?

A popular way to explain how current LLMs work is to say that "all" they do is predict the next most likely word in a sentence. From one perspective, this is correct. Trained on a vast corpus of human writing, the LLMs distilled billions of word sequences so that they can produce authentic-sounding strings of words that have never been said before. These sentences sound plausible because, having been trained on an enormous number of ordinary human texts, the models were predicting what an average human might say next. They really did succeed at that expected task.
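To make "predict the next most likely word" concrete, here is a toy sketch. It is not how a real LLM works: actual models use neural networks over tokens and far longer contexts, while this one only counts which word most often follows another in a tiny made-up corpus. The function names (`train_bigram`, `predict_next`, `generate`) and the corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows, start, length=5):
    """Extend `start` by repeatedly appending the most likely next word."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# A made-up miniature corpus (an assumption for this sketch only).
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram(corpus)

print(predict_next(model, "the"))   # "cat" follows "the" more often than any other word
print(generate(model, "dog", 2))
```

Even this crude counter can emit word strings that never appeared verbatim in its training text, which is the narrow sense in which "just predicting the next word" is a correct description.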

What is harder to account for is the emergence of creative abilities in the LLMs.

The amount of intelligence required to compose one coherent sentence can almost be reduced to the rules in a grade-school grammar book. But the amount of intelligence needed to produce a string of sentences focused on one topic — a paragraph — far exceeds any rules. And the amount of intelligence wrapped up in a string of paragraphs, as in a conversation, begins to approach a pattern we call "thinking." Keep in mind all the work a human needs to do to write a coherent page of text. As researchers scaled up the size and scope of LLMs, they were stunned to find that their systems could begin to imitate the elemental patterns of human thinking found in paragraphs and conversations.

They were shocked because at no point in their invention did they try to program in the elemental process of thinking, or intelligence. They were "merely" extending the patterns of language. The collective surprise of an LLM such as ChatGPT is that by extending the pattern of language, we can arrive at some level of intelligence that is useful beyond language.

If programmers did not program ChatGPT with logical deduction skills, where does the intelligence in its models come from? Why can LLMs behave so intelligently (even if not infallibly), when no one has programmed them to be intelligent? The apparent intelligence of LLMs has been very troubling to experts in the AI field, because there was no theory of intelligence that predicted large models of language would be able to deduce logic, or solve the mathematics of the protein-folding problem.

Intelligence locked in language

One explanation is that the elemental intelligence exhibited by LLMs is locked within human writing and in language itself. You can construct a sentence using a grammar rulebook, but to construct a paragraph you need logic, deduction, and reasoning. And further, as any teacher will tell you, to create a coherent essay (a string of paragraphs) you need some kind of clear thinking. The voluminous training material scooped up by the LLM creators is more than just words, more than just sentences, more than just paragraphs. All those trillions of words are embedded in articles, books, essays, rants, replies, comments, tweet-threads, arguments, debates, stories, tales, accounts, reports, blogs. These, and a hundred other long forms, contain intelligence in their arrangement of words. It is the architecture of language that conveys...
