Turning brain prediction models into testable explanations

Understanding the brain with AI-driven explanations and experiments - Microsoft Research

At a glance

LLM-based models can predict the human brain’s responses to language with high accuracy. But what drives that performance is essentially opaque: a vast collection of learned parameters rather than a scientific theory anyone can read.

Generative causal testing (GCT), developed in a collaboration between Microsoft Research, the University of California, Berkeley, the University of California, San Francisco, and Columbia University, distills these brain-prediction models into short verbal explanations of what each patch of cortex responds to: phrases like “food preparation” or “location names.”

GCT then closes the loop: an LLM writes new stories designed to activate a targeted brain area, subjects hear them in the scanner, and the region lights up only if the explanation is right.

In experiments, GCT confirmed known selectivity, teased apart neighboring place-processing regions long thought interchangeable, and revealed tiny prefrontal “micro-regions” tuned to specific concepts like dialogue, clock times, and measurements.

The explainability problem in language neuroscience

Over the past decade, LLMs have become the most accurate tools we have for predicting how the human brain responds to language. Feed an LLM the same story a person hears in an fMRI scanner, and the model’s internal representations can predict the activity of individual patches of cortex with remarkable fidelity. But this success comes with a catch: nobody can read these models. They consist of millions of inscrutable parameters that can’t be directly translated into interpretations. A model that predicts brain activity tells us that a region responds to language, but not what the region is actually picking up on: food, places, numbers, or something else entirely. As black-box models spread, the gap between prediction and understanding has become one of the central problems in computational neuroscience.

Turning black boxes into testable theories

In a new paper accepted in Nature Neuroscience, Microsoft Research scientists, in collaboration with scientists at the University of California, Berkeley, University of California, San Francisco, and Columbia University, introduce a framework to overcome this explainability crisis: generative causal testing (GCT). GCT distills brain-prediction models into short, readable accounts of what each patch of cortex responds to, then tests those claims. An LLM writes new stories engineered to activate a specific brain area, subjects hear them in the scanner, and if the explanation is correct, the targeted region lights up. The result is a method that translates uninterpretable predictive models back into the currency of science: concise hypotheses that can be confirmed or refuted in a follow-up experiment.
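The explain-then-test loop described above can be illustrated with a toy sketch. This is not the authors' implementation: the word-weight "model" (`TOY_WEIGHTS`), the hard-coded explanation label, the simulated scanner noise, and the permutation test are all assumptions chosen so the example runs without an LLM or fMRI data. In a real pipeline, an LLM would write the summary and the stories, and the responses would come from subjects in the scanner.

```python
import random
import statistics

# Toy stand-in for one region's fitted brain-prediction model (assumption):
# each known word carries a weight; unknown words contribute nothing.
TOY_WEIGHTS = {"chop": 1.0, "simmer": 0.9, "bake": 0.8, "stir": 0.7,
               "road": 0.1, "noon": 0.0}

def predict(text):
    """Predicted regional response: mean weight over the words in `text`."""
    words = text.lower().split()
    return sum(TOY_WEIGHTS.get(w, 0.0) for w in words) / len(words) if words else 0.0

def step1_explain(candidate_phrases, k=3):
    """Step 1: rank phrases by predicted response and summarize the top k."""
    top = sorted(candidate_phrases, key=predict, reverse=True)[:k]
    # A real pipeline would ask an LLM to summarize `top`; this label is hard-coded.
    return top, "food preparation"

def step2_test(target_stories, baseline_stories, measure, n_perm=2000, seed=0):
    """Step 2: compare measured responses to targeted vs. baseline stories
    using a one-sided permutation test on the difference in means."""
    target = [measure(s) for s in target_stories]
    base = [measure(s) for s in baseline_stories]
    observed = statistics.mean(target) - statistics.mean(base)
    pooled = target + base
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (statistics.mean(pooled[:len(target)])
                - statistics.mean(pooled[len(target):]))
        if diff >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Step 1 on a handful of made-up phrases.
phrases = ["chop simmer", "bake stir", "road noon", "simmer bake", "noon road road"]
top, explanation = step1_explain(phrases)

# Simulated "scanner": the toy model's prediction plus Gaussian noise (assumption).
noise = random.Random(1)
def simulated_scan(story):
    return predict(story) + noise.gauss(0, 0.05)

# Step 2: stories written to match the explanation vs. unrelated baseline stories.
target_stories = ["chop bake stir", "simmer stir chop",
                  "bake simmer bake", "stir chop simmer"]
baseline_stories = ["road noon road", "noon noon road",
                    "road road noon", "noon road noon"]
effect, p_value = step2_test(target_stories, baseline_stories, simulated_scan)
print(explanation, round(effect, 3), round(p_value, 4))
```

If the explanation were wrong, the targeted stories would drive the region no more than the baseline stories, the effect would sit near zero, and the test would fail to reject. That is the sense in which the explanation becomes a hypothesis that can be confirmed or refuted.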

Figure 1. The two steps of generative causal testing (GCT). In Step 1, the phrases that most strongly drive a brain region’s predictive model are summarized by an LLM into a...
