Why the Human Genome's Tangled Physicality May Confound AI

Why the Human Genome’s Tangled Physicality May Confound AI | Quanta Magazine

An editorially independent publication supported by the Simons Foundation.

By Philip Ball

June 18, 2026

Our genetic heritage is not a blueprint or an algorithm, as many biologists have imagined, but something else entirely.

Samuel Velasco and Hannah Waters/Quanta Magazine

Introduction

By Philip Ball

Contributing Writer

Since its molecular structure was deduced in the 1950s, DNA has been hailed by many biologists as the secret of life. They’ve read and studied the information stored in the DNA found in the cells of living organisms, known as their genomes, and claimed that this genetic database must be some kind of blueprint, code script, or computer. But if DNA really does harbor some greater secret about how life works, biologists have yet to find it.

In fact, the human genome is less a script than a puzzle that gets harder the closer biologists look. Knowing the entire sequence (the order of all 3 billion or so of our DNA’s chemical building blocks, nearly fully deduced by the international Human Genome Project between 1990 and 2003) hasn’t helped much. That investigation showed that barely 2% of the human genome consists of actual genes, the information-coding sequences of DNA.
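To put those proportions in absolute terms, here is a rough back-of-the-envelope sketch. It takes the article's round figures at face value (3 billion letters, 2% coding); the function and variable names are purely illustrative, and the real numbers are fuzzier than any single calculation suggests.

```python
# Rough arithmetic on the article's round figures; not a precise genomic count.
GENOME_LETTERS = 3_000_000_000  # "3 billion or so" DNA building blocks
CODING_PERCENT = 2              # "barely 2%" consists of actual genes


def split_genome(total_letters: int, coding_percent: int) -> tuple[int, int]:
    """Return (coding, noncoding) letter counts using integer arithmetic."""
    coding = total_letters * coding_percent // 100
    return coding, total_letters - coding


if __name__ == "__main__":
    coding, noncoding = split_genome(GENOME_LETTERS, CODING_PERCENT)
    print(f"coding:    about {coding:,} letters")
    print(f"noncoding: about {noncoding:,} letters")
```

On these assumptions, roughly 60 million letters encode genes and about 2.94 billion do not.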

It’s now clear that understanding the human genome is no longer a matter of figuring out what each gene does. The deeper and much harder question is how those genes are used, or regulated, a question that seems to involve some and perhaps much of the rest of the genome. By switching suites of genes on and off, the many different cell types in our bodies can all be created from the same material. Cells also regulate their genes from moment to moment in response to a constant inflow of signals from their neighbors and surroundings. But the processes that govern gene regulation are proving so complex that some biologists wonder whether a full understanding of it — of how the genome really works — will ever be within the grasp of our puny minds.

Some are counting on outsourcing the analysis to artificial intelligence. Genomic “foundation models” such as Evo 2, Genos, and Google DeepMind’s AlphaGenome are trained on vast quantities of genomic data. Biologists use them to predict how differences in DNA sequence affect biological processes and, ultimately, the traits (including disease risk) of a whole organism. These algorithms don’t worry about the complicated regulatory details; all of that is supposedly subsumed by the algorithm’s “training,” through which it deduces correlations from cases we already know about.

This approach is likely to be useful, but for those who crave real understanding of how the genome, and ultimately life itself, works, a computational black box will never suffice. And perhaps more to the point, the genome might not submit to the kind of straightforward input-output approach that such AI models ultimately assume.

That’s because the genome is no blueprint or algorithm. It is something else.

The Old View

Given that it’s the product of around 4 billion years of evolution, perhaps it’s not surprising that our genome is complicated. The surprise has been what those complications are. “Our genome is not what we might make it if we sat down at the drawing board,” said the biologist Karen Adelman, who studies gene regulation at Harvard Medical School.

“We’ve stopped thinking about the genome as a linear piece of DNA code.” (Wendy Bickmore, University of Edinburgh)

The traditional view posits that a small proportion of our DNA holds the code for making the protein molecules that orchestrate our cells’ chemistry. Each instruction for a protein is held in a corresponding gene — we have around 20,000 of these — and gene sequences can range in length from a couple of dozen to almost 3 million DNA “letters” (representing molecules called nucleotides). Making a protein from its gene is a two-stage affair. First the DNA is read, letter by letter, by an enzyme called a polymerase, which creates a copy of that code in a related molecule called messenger RNA (mRNA). This is called transcription. The mRNA is then read...
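The letter-by-letter reading step in this traditional picture can be sketched as a toy program. This is a deliberately simplified illustration, not a model of what a polymerase physically does: it assumes the template strand is supplied as a plain string of A, C, G, and T, and it uses the standard base-pairing rules (A pairs with U in RNA, T with A, G with C, C with G). The function name and the example sequence are hypothetical.

```python
# Toy model of transcription under the traditional, linear view of the genome.
# Standard base pairing between a DNA template strand and the new RNA strand.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Build an mRNA string by pairing each letter of a DNA template strand."""
    mrna = []
    for letter in template_strand.upper():
        if letter not in PAIRING:
            raise ValueError(f"not a DNA letter: {letter!r}")
        mrna.append(PAIRING[letter])
    return "".join(mrna)


if __name__ == "__main__":
    # Hypothetical nine-letter template, chosen only for illustration.
    print(transcribe("TACGGCATT"))  # prints AUGCCGUAA
```

The simplicity is the point: a lookup table and a loop capture the old view almost completely, which is why it was so tempting to describe the genome as a code.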
