The Verification Problem (On OpenAI's Erdős Disproof)

The Verification Problem — Alex Korbonits May 23, 2026 · 11 min read<br>The Verification Problem<br>aimathematicsverification

In October, OpenAI said its model had solved ten open Erdős problems. It hadn’t.1 The model had surfaced existing solutions buried in the literature and presented them as fresh conquests, and Thomas Bloom — who runs erdosproblems.com and would know — said so publicly and without much patience for the framing.2

Last week OpenAI announced that an internal model had disproved Erdős’s planar unit-distance conjecture.3 This time the announcement came with a companion paper written by nine mathematicians, and one of the nine was Thomas Bloom.4

From the outside, the two announcements are indistinguishable: AI solves famous math problem, same company, same Erdős. What separates them is not that the model got smarter between October and May. What separates them is that the second result was verified and the first was not. This post is an attempt to say what verified means, why it was expensive, and what happens to mathematics when the thing producing proofs gets cheaper than the thing checking them. We need to automate the verification.

What the model actually did

The unit-distance problem, posed by Erdős in 1946, asks how many pairs among nnn points in the plane can be exactly distance one apart. A n×n\sqrt{n} \times \sqrt{n}n×n grid, read as Gaussian integers, gives about n1+c/log⁡log⁡nn^{1 + c/\log\log n}n1+c/loglogn pairs — barely faster than linear. For eighty years the belief, Erdős’s own, was that you couldn’t do meaningfully better: the conjectured ceiling was n1+o(1)n^{1 + o(1)}n1+o(1). Bloom had it on a “Top 10 Erdős Problems” list one month before the result landed, written partly to push back on the idea that unsolved Erdős problems were trivia nobody had bothered to attempt. This was not trivia. Many working combinatorial geometers had thought about this problem.

The model disproved it. It produced an infinite family of configurations achieving n1+δn^{1+\delta}n1+δ for a fixed δ>0\delta > 0δ>0 — a polynomial improvement, a different order of magnitude than anyone expected — later pinned by Will Sawin at δ=0.014\delta = 0.014δ=0.014.4 The proof does not come from better geometry. It imports algebraic number theory: infinite class field towers, Golod–Shafarevich theory, the machinery of factorization in field extensions, turned onto an elementary question about dots on a page.

Here is the part the headlines flattened. The construction is, in Bloom’s own assessment, a natural generalization of Erdős’s original grid — natural with eighty years of hindsight, and genuinely hard to find without it. Erdős built his grid from the Gaussian integers, a single fixed field. The model’s move was to replace the fixed field with a tower of fields of growing degree. Sawin and Jacob Tsimerman, both in the companion paper, explain why that inversion is so unintuitive: the natural instinct is to take a sequence of larger and larger point sets inside one field, and if you do the calculation that way, the field you choose doesn’t seem to matter — there’s no signal telling you to vary it. You have to vary the wrong-seeming knob. Tsimerman actually tried this problem, with exactly the increasing-degree idea, and didn’t get there; he calls increasing degree “a very scary dynamic that often doesn’t work out.”4

So: a real disproof of a real conjecture, using deep tools, that several of the experts who checked it believe they could have found and did not.

The texture of the reasoning

OpenAI released a rewritten version of the model’s chain of thought, and it’s the most interesting document in the pile, because it’s the one thing that’s hard to fake and hard to retrieve.5

The model takes the best known construction and starts trying to generalize it, and then spends pages failing in specific, recognizable ways. It tries the Boolean hypercube — subset sums of unit vectors — and notes it gives only nlog⁡nn \log nnlogn, hypercube-scale. It tries powers of the rational point (3+4i)/5 on the unit circle, works through clearing denominators, and lands back at “exactly the divisor-function picture, only linear.” It tries higher-degree algebraic numbers, then Dirichlet units, and kills the units idea with a clean observation about CM fields: the relative unit rank is zero, so norm-one relative units are essentially just roots of unity. Dead end after dead end, each rejected for a stated mathematical reason, not abandoned at random.

And then, on the page Arul Shankar flagged by name in his remarks,4 it turns. The paragraph opens “Suppose optimistically that K is a high-degree CM field in which a fixed rational prime splits completely into principal prime ideals” — and the model immediately sees the prize, calls the resulting construction “frightening,” sketches the point count, and in the same breath stress-tests it against the known n4/3n^{4/3}n4/3 incidence barrier: if the prime were 2 or 5 the construction...

The Verification Problem (On OpenAI's Erdős Disproof)

Related Articles

Amazon, Facebook, FBI have access to a private intelligence-sharing network

SpaceX not the behemoth everyone thought

The Mirror Is Part of the Machine

Elevated error rates on requests to multiple models

Donald Trump and sons to be 'forever' exempt from tax audits