Leanstral 1.5: Proof Abundance for All

meetpateltech1 pts0 comments

Leanstral 1.5: Proof Abundance for All

Categories<br>Product

Research

Engineering

Solutions

Company

Featured stories

ASML

CMA CGM

HSBC

BMW

See all

Who we are

About us

Careers

Brand

Connect<br>Community

Partners

Help center

Studio<br>Build, test, and run AI agents and apps.

Forge<br>Train, align, and evaluate custom AI models.

Vibe<br>AI agent for long-horizon work.

Vibe for code<br>Coding agents in the terminal, IDE, and background.

Compute<br>Frontier-scale infrastructure for training and inference.

Pricing

Plans

API pricing

For enterprises

Services

Delivery methodology

Model customization

Industries

Financial services

Public sector & government

Manufacturing

Use cases

Use case overview

Coding

Document intelligence

Speech

Latest models

Mistral OCR 4

Mistral Medium 3.5

Mistral Small 4

Voxtral TTS

Docs

API Reference

Cookbooks

Latest posts

Leanstral 1.5: Proof Abundance for All

Bringing more control over your connectors

Introducing Mistral OCR 4

Categories

Product

Research

Engineering

Solutions

Company

Featured stories

ASML

CMA CGM

HSBC

BMW

Who we are

About us

Careers

Brand

Connect

Community

Partners

Help center

Start building

Studio

Vibe

Vibe for Code

Contact sales

Research<br>Leanstral 1.5: Proof Abundance for All<br>July 2, 2026<br>By Leanstral Team at Mistral AI

Back to Blog

6 min read

Share this post

Copy url to clipboard Copied

Thinking<br>Summary

Leanstral 1.5, a free Apache-2.0 licensed model with 6B active parameters, delivers a major performance upgrade in formal verification, saturating miniF2F, solving 587/672 PutnamBench problems, and achieving state-of-the-art results on FATE-H (87%) and FATE-X (34%). Trained through mid-training, supervised fine-tuning, and reinforcement learning with CISPO, it excels in agentic proof engineering and real-world code verification, uncovering 5 previously unknown bugs across 57 repositories tested. Fully open-sourced and available via Hugging Face and a free API, Leanstral 1.5 is now accessible for practical proof engineering in Lean 4.

Since its launch, Leanstral has offered an open, practical approach to proof engineering in Lean 4. Today, we are releasing Leanstral 1.5 , a free Apache-2.0 licensed model with 119B total and only 6B active parameters, delivering a performance upgrade that makes formal verification more powerful and accessible than ever.<br>Leanstral 1.5 saturates miniF2F, solves 587/672 PutnamBench problems, and achieves a new state-of-the-art of %87 on FATE-H and 34% on FATE-X . Beyond benchmarks, it verifies complex code properties and uncovers previously unknown bugs in open-source repositories —proving that rigorous formal methods can be both effective and practical for real-world use.

Training Leanstral<br>Leanstral 1.5 goes through a three-stage process: mid-training, supervised fine-tuning, and reinforcement learning with CISPO. Leanstral 1.5 leverages extensive training on two RL environments:<br>In the multiturn environment , the model is given a theorem statement and must either prove or disprove it. The model submits a proof, receives Lean compiler feedback, and refines its approach with each attempt. If the proof compiles it succeeds; otherwise the loop continues until the model either solves the problem or exhausts its budget.<br>In the code agent environment , Leanstral operates like a developer in a raw filesystem: it edits files, runs bash commands, and uses the Lean language server to inspect goals, errors, and type information in real time. This allows it to tackle long-horizon tasks like completing partial proofs in a repository, building auxiliary lemmas, and persisting through multiple rounds of context compaction. The model learns to navigate the full proof-engineering workflow and is finally verified by our fork of SafeVerify for correctness given a list of target theorems.<br>Evaluation<br>We evaluate Leanstral on the following benchmarks:<br>miniF2F is a cross-system benchmark for formal mathematics, ranging from elementary problems to IMO-level challenges, testing diverse proof abilities across algebra, combinatorics, and number theory.

PutnamBench consists of 672 problems from the Putnam Mathematical Competition, requiring deep reasoning and long proof chains to solve challenging mathematical problems.

FATE-H and FATE-X are abstract algebra benchmarks for graduate and PhD-level problems, respectively, testing advanced reasoning in areas like group theory, ring theory, and module theory.

FLTEval is based on real pull requests from the Fermat’s Last Theorem repository, testing practical proof engineering with real-world complexity.

We saturate miniF2F completely, reaching 100% on both the validation and test sets. On PutnamBench and FATE-H/X, we compare Leanstral 1.5 against Goedel-Architect without natural-language guidance, Seed-Prover 1.5 at its high setting, and AxProverBase. Leanstral reaches a new state-of-the-art on FATE-H/X, solving 87 and 34 problems respectively. On...

leanstral proof fate engineering model problems

Related Articles