Introducing HRM-Text



Sapient, May 18, 2026

HRM-Text is an ultra-lean 1B reasoning language model that achieves strong reasoning performance at a fraction of the compute and data required by conventional approaches. It is available today as a fully open release.

Today we are releasing HRM-Text, a 1.15-billion-parameter hierarchical reasoning language model trained on approximately 40 billion tokens of structured data. That is up to 1,000× fewer tokens than the 4–36 trillion used in typical open pretraining runs — and the entire model can be pretrained in roughly one day for about a thousand dollars.
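To make the token comparison concrete, here is a small arithmetic sketch using only the figures quoted above (about 40 billion tokens versus 4 to 36 trillion). The function and constant names are illustrative. Under these numbers the ratio spans roughly 100× to 900×, so "up to 1,000×" reads as a rounded upper bound.

```python
# Token-budget ratio between typical open pretraining runs and HRM-Text.
# All figures come from the announcement; the names are illustrative only.
HRM_TEXT_TOKENS = 40 * 10**9                   # ~40 billion tokens
TYPICAL_RANGE = (4 * 10**12, 36 * 10**12)      # 4 to 36 trillion tokens


def token_ratio(other_tokens: int, own_tokens: int = HRM_TEXT_TOKENS) -> float:
    """How many times more tokens the other run used."""
    return other_tokens / own_tokens


low, high = (token_ratio(t) for t in TYPICAL_RANGE)
print(f"{low:.0f}x to {high:.0f}x fewer tokens")  # 100x to 900x fewer tokens
```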

Parameters: 1B

Training tokens: 40B

Pretraining cost: ~$1K

Size at int4: 0.6 GiB

Despite this dramatically smaller training footprint, HRM-Text is competitive with models trained at far greater cost and scale. In independent verification conducted in April 2026, it achieved 56.2% on MATH, 82.2% on DROP, 81.9% on ARC-Challenge, and 60.7% on MMLU.

MATH: 56.2% (multi-step mathematical reasoning)

DROP: 82.2% (discrete reasoning over paragraphs)

ARC-Challenge: 81.9% (science and commonsense reasoning)

MMLU: 60.7% (57-domain general knowledge)

Benchmark Performance

The table below compares HRM-Text 1B against models trained on orders of magnitude more data and compute, including open-weight and proprietary systems. HRM-Text results reflect the base model only — no post-training, fine-tuning, or reinforcement learning has been applied.

A note on these results: HRM-Text is a proof-of-concept base model and has not undergone post-training or reinforcement learning, whereas the comparison models have all received extensive fine-tuning and alignment training, which substantially shifts benchmark results. The HRM-Text numbers therefore reflect the architecture and pretraining alone, and we expect post-training to raise them.

How HRM-Text Works

Task Completion

Conventional models learn by predicting the next token, so every word in the training data carries equal weight: filler phrases, function words, and key reasoning steps are all treated the same. We instead train HRM-Text on structured instruction-response pairs, so the model learns from reasoning steps and solutions rather than from surface language. It is not learning which word comes next; it is learning how to complete a task.

This lets the model quickly identify patterns and rules shared across similarly structured examples and build skills from them, which reduces its reliance on massive datasets. In this sense, task-completion training is closer to how people learn: efficiently, and from a small number of examples.
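The announcement does not say how the task-completion objective is implemented. One common way to train on instruction-response pairs is to compute the loss only over response tokens, so the instruction is context but not a prediction target. The sketch below illustrates that idea in plain Python. It is an assumption about the general technique, not a description of HRM-Text's training code, and the log-probabilities are made-up values.

```python
import math


def masked_nll(token_logprobs, is_response):
    """Mean negative log-likelihood over response tokens only.

    token_logprobs: log-probability the model assigned to each target token.
    is_response:    True where the token belongs to the response.
    """
    picked = [-lp for lp, keep in zip(token_logprobs, is_response) if keep]
    if not picked:
        raise ValueError("example has no response tokens")
    return sum(picked) / len(picked)


# Hypothetical example: two instruction tokens followed by two response tokens.
logprobs = [math.log(0.5), math.log(0.5), math.log(0.25), math.log(0.25)]
response_mask = [False, False, True, True]

# Plain next-token loss averages over every position.
full_loss = masked_nll(logprobs, [True] * len(logprobs))
# Response-only loss ignores the instruction positions (assumed objective).
response_loss = masked_nll(logprobs, response_mask)

print(f"all tokens: {full_loss:.3f}, response only: {response_loss:.3f}")
```

With masking, gradient signal comes only from the tokens that make up the solution, which is one plausible reading of "learning how to complete a task".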

Latent Space Reasoning

HRM-Text performs reasoning in continuous latent space, rather than relying on long chains of visible intermediate reasoning tokens. Its reasoning process takes place inside the model’s internal representation space, allowing more computation to happen before the final answer is produced.

This is important because reasoning depth does not need to be expressed as longer visible output. Instead of externalizing every intermediate step as text, HRM-Text carries out additional reasoning internally within a single forward pass. This enables deeper computation while keeping outputs concise and token-efficient.

By placing more reasoning inside continuous latent computation, HRM-Text reduces dependence on long reasoning traces and large output-token budgets. This contributes to both its reasoning efficiency and its suitability for compact, local deployment.

Hierarchical Latent Recurrent Architecture

HRM-Text implements latent-space reasoning through a hierarchical latent recurrent architecture built from two levels: a High-level (H) stack and a Low-level (L) stack. In a single forward pass, the model performs 2 H updates and 6 L updates, for a total of 8 stack iterations. The L-stack updates more frequently than the H-stack, creating a multi-timescale reasoning structure. Cross-level information flows at every recurrent step, allowing the model to think in its internal state before producing outputs.

This architecture increases reasoning computation per token without proportionally increasing parameter count. Instead of scaling capability only by making the model larger, HRM-Text reuses parameters through recurrence, allowing deeper internal computation while keeping the model compact.
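The update schedule can be sketched with a toy recurrence. The counts (2 H updates, 6 L updates, 8 iterations in total) come from the description above. Everything else is an assumption made for illustration: the even split of three L updates per H update, the ordering, and the scalar `tanh` update functions, which stand in for what would be full network stacks in the real model.

```python
import math

H_UPDATES = 2                       # from the announcement
L_UPDATES = 6                       # from the announcement
L_PER_H = L_UPDATES // H_UPDATES    # assumed: L updates split evenly per H update


def l_step(z_l: float, z_h: float, x: float) -> float:
    """Toy low-level update; reads the high-level state at every step."""
    return math.tanh(0.5 * z_l + 0.3 * z_h + x)


def h_step(z_h: float, z_l: float) -> float:
    """Toy high-level update; reads the latest low-level state."""
    return math.tanh(0.5 * z_h + 0.7 * z_l)


def forward(x: float, z_h: float = 0.0, z_l: float = 0.0):
    """Run one forward pass and record which stack was applied at each iteration."""
    trace = []
    for _ in range(H_UPDATES):
        for _ in range(L_PER_H):
            z_l = l_step(z_l, z_h, x)   # same function (same weights) reused
            trace.append("L")
        z_h = h_step(z_h, z_l)          # slower timescale
        trace.append("H")
    return z_h, trace


z_h, trace = forward(0.2)
print("".join(trace), len(trace))  # LLLHLLLH 8
```

The same two update functions are applied repeatedly, so the number of computation steps grows from 2 to 8 while the number of distinct parameter sets stays at two. That is the parameter reuse the text describes.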

Key Capabilities

Data-Efficient Training

Trained on ~40B tokens, up to 1,000× fewer than the 4–36T tokens used by the models we benchmark against.

Compact Yet Powerful

Built with 1.15B parameters while remaining competitive with models several times its size on reasoning-heavy benchmarks.

Native Edge Reasoning

Runs locally with a 0.6 GiB footprint at int4 quantization, enabling advanced reasoning without cloud dependency.
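As a rough sanity check on the 0.6 GiB figure, the weight footprint can be estimated from the parameter count and the bits stored per weight. The announcement does not specify the quantization scheme, so the grouped variant below (one 16-bit scale per 32 weights) is purely an assumption chosen to show how per-group metadata adds to the raw 4 bits.

```python
PARAMS = 1.15e9     # parameter count from the announcement
GIB = 2**30         # bytes per GiB


def footprint_gib(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given bits-per-weight budget."""
    return params * bits_per_weight / 8 / GIB


# Raw 4-bit weights with no scales or other metadata.
raw = footprint_gib(PARAMS, 4.0)

# Assumed scheme: one 16-bit scale per group of 32 weights,
# which adds 16 / 32 = 0.5 bits per weight.
grouped = footprint_gib(PARAMS, 4.0 + 16 / 32)

print(f"raw int4: {raw:.2f} GiB, with group scales: {grouped:.2f} GiB")
```

Under these assumptions the estimate lands at roughly 0.54 GiB for raw weights and about 0.60 GiB with group scales, which is in line with the quoted footprint. Actual file sizes depend on the real quantization format and on which tensors are kept at higher precision.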

Broader Impact

Accessibility

By dramatically reducing the compute and infrastructure required to train and run a capable reasoning model,...
