Checking the math behind OpenAI and Anthropic's latest headlines

Marcus on AI

Always read the fine print

Gary Marcus May 21, 2026

OpenAI scored a big result yesterday on an Erdős problem:

OpenAI@OpenAI

Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946.

For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids.

An OpenAI model has now disproved that

7:06 PM · May 20, 2026 · 6.93M Views

688 Replies · 2.64K Reposts · 19.2K Likes

Clearly impressive. But as with so much else, it should be viewed with skepticism. In an email to me this morning, Cal Newport made a number of good points that he said I could share, summarizing both what was found and some limitations:

OpenAI used a new reasoning model (not yet released) to help identify a counterexample that disproved a conjecture from discrete geometry first proposed by Paul Erdos 80 years ago. The model was tuned for so-called chain-of-thought reasoning where the model endlessly “thinks out loud” about whatever it is trying to solve, an approach that lets you approximate something like memory and dynamic computation using LLMs, which are otherwise static and feed-forward. Professional mathematicians identified the counterexample from within a long transcript of the model’s reasoning, and then extracted the key parts and rewrote it as a more succinct proof in a more standard style.

How was the model able to solve something that human mathematicians had failed to do? In a companion article released by OpenAI, the mathematician Thomas Bloom, who reviewed the full model output, identified the factors that came together to make this counterexample ripe for LLM-aided discovery. He noted that though the conjecture is old, those who have worked on it have largely shared Erdos’s original belief that it was true and therefore focused on trying to solve it. What the LLM-based tool did instead was to systematically apply and extend existing techniques in search of evidence that the conjecture was false.

Here’s Bloom: “[the AI’s] success here echoes previous achievements: it often produces the most surprising results by persevering down the paths that a human may have dismissed as not worth their time to explore, combining superhuman levels of patience with familiarity with a vast array of technical machinery.”

A few observations of my own:

(1) Non-mathematicians might not be familiar with the degree to which LLM-technology has been combined with existing computer-aided math tools in recent years to seek new math results through the systematic and patient exploration of techniques and corners of problem spaces that are too exhausting to interest most human mathematicians. The real technical headline of the new OpenAI result, therefore, is that chain-of-thought reasoning was able to accomplish this type of systematic solving without the much more intricate scaffolding used in most of these existing tools. That being said, the internal model used here, which many assume is OpenAI’s response to the truly massive Mythos LLM, is likely similarly massively expensive to prompt. The future of AI-assisted math will likely focus on smaller, cheaper, math-tuned LLMs combined with more powerful scaffolding. So, this experiment might be more about marketing the power of their new model than trying to actually advance computer-aided math.

(2) I don’t think it’s accurate to say these examples of AI-supported mathematics mean the models are somehow “smarter” than human mathematicians. I think a better analogy might be how computer tools helped architects produce much more daring and complicated designs (like the Frank Gehry-designed Stata Center where I did my CS doctoral and postdoctoral work at MIT). These tools weren’t better architects than humans but made humans more capable architects.

(3) From a business perspective, I actually think this announcement isn’t necessarily good news for OpenAI. There are few markets smaller and less lucrative than professional academic mathematics. The fact that this is the area where OpenAI is dedicating some of their top technical talent (like Noam Brown) underscores the degree to which, like the drunk searching for their keys under the streetlight, their most impressive results are limited to the smaller number of areas that are well-suited to LLMs (i.e., math + computer coding). If this model was brilliant in some more general way, obviously the better examples would be solving problems or automating processes that directly and obviously generate massive revenue or savings for the specific types of companies they hope to make their customers.

In conclusion: AI’s role in math is genuinely important and exciting. I can think of any number of results I’ve worked on in my career where I could have moved faster or...
