Mistral OCR 4 : SOTA OCR for Document Intelligence
Research<br>Introducing OCR 4
June 23, 2026<br>By Mistral AI
Today, we're releasing Mistral OCR 4, featuring bounding boxes, block classification, and inline confidence scores alongside extracted text. The model supports 170 languages across 10 language groups, runs in a single container for fully self-hosted deployments, and serves as an ingestion component for enterprise search, RAG, and domain-specific retrieval pipelines. OCR 4 is a small, focused model, and this post covers what's new, how it performs on public and internal benchmarks, the known limitations of those benchmarks, and guidance on when to use the model API versus Document AI.
Highlights<br>Breakthrough performance. Independent annotators prefer OCR 4 over every leading OCR and document-AI system tested, with win rates averaging 72%, alongside the top overall score on OlmOCRBench (85.20). See Benchmarks below for methodology and known scoring limitations.
Segmentation, not just text. Alongside the extracted text, OCR 4 returns bounding boxes, typed-block classification (titles, tables, equations, signatures, and more), and inline confidence scores. Bounding boxes, our most-requested capability, localize text for in-context highlighting and reliable data pipelines. Block types and confidence scores, in turn, drive source-grounded citations, redactions, and human-in-the-loop verification.
Integrated with Mistral Search Toolkit (public preview). OCR 4 is an ingestion component of Search Toolkit, Mistral's open-source, composable search framework, announced at the AI Now Summit. Its structured output supplies citation-ready inputs to the toolkit's ingestion, retrieval, and evaluation workflow for RAG and enterprise search.
Multilingual coverage. Support for 170 languages across 10 language groups, with measurable gains on rare and low-resource languages where several competing systems degrade.
Run on your own infrastructure. OCR 4 is compact enough to deploy in a single container, keeping document data in your environment for residency, sovereignty, and compliance, while supporting cost-efficient, high-throughput batch processing. Self-managed deployment is available to enterprise customers.
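To make the human-in-the-loop use of confidence scores concrete, here is a minimal sketch of routing blocks to manual review. It does not reflect the actual OCR 4 response schema, which this post does not specify: the `Word` and `Block` classes, the `(x0, y0, x1, y1)` box layout, the `[0, 1]` score range, and the `0.90` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    confidence: float  # hypothetical per-word score in [0, 1]


@dataclass
class Block:
    block_type: str                          # e.g. "title", "table", "signature"
    bbox: Tuple[float, float, float, float]  # hypothetical (x0, y0, x1, y1)
    words: List[Word]

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)

    @property
    def min_confidence(self) -> float:
        # An empty block has nothing to doubt, so treat it as fully confident.
        return min((w.confidence for w in self.words), default=1.0)


def route_blocks(
    blocks: List[Block], threshold: float = 0.90
) -> Tuple[List[Block], List[Block]]:
    """Split blocks into (accepted, needs_review) by their weakest word."""
    accepted: List[Block] = []
    needs_review: List[Block] = []
    for block in blocks:
        target = accepted if block.min_confidence >= threshold else needs_review
        target.append(block)
    return accepted, needs_review


# Made-up sample data, not real model output.
blocks = [
    Block("title", (72, 40, 540, 70), [Word("Invoice", 0.99), Word("2041", 0.97)]),
    Block("signature", (300, 700, 540, 760), [Word("J.", 0.62), Word("Doe", 0.81)]),
]
accepted, needs_review = route_blocks(blocks)
for block in needs_review:
    # The bounding box tells a reviewer where to look on the page.
    print(block.block_type, block.bbox, block.text)
```

Scoring a block by its weakest word is a deliberately conservative choice: one uncertain digit in an invoice total should be enough to flag the whole block. A mean or a type-specific threshold may fit other workloads better.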
Overview<br>Mistral OCR 4 extracts and structures content from a wide range of documents. Where previous generations focused on converting a page into clean text and tables, OCR 4 returns a structured representation of the document. Each block is localized with a bounding box and classified by type, and inline confidence scores are generated per page and per word. Downstream systems therefore have access not only to what the document says but also to where each element sits, what role it plays, and how confident the model is in each region.<br>This structure supports several downstream workloads:<br>Semantic chunking for RAG: clean, classified blocks become better retrieval units.
Structural primitives for agents: agents move from reading documents to acting on them (form filling, invoice processing, compliance checks).
Structured content for connectors: consistent, typed output for ingestion and indexing pipelines.
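As an illustration of the semantic chunking workload listed above, the sketch below groups typed blocks into retrieval chunks. It is one possible approach, not the method used by OCR 4 or Search Toolkit. The `Block` and `Chunk` classes, the `"title"`, `"text"`, and `"table"` labels, and the rule that each table becomes its own chunk are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # hypothetical (x0, y0, x1, y1)


@dataclass
class Block:
    block_type: str  # hypothetical labels: "title", "text", "table"
    text: str
    page: int
    bbox: BBox


@dataclass
class Chunk:
    heading: str
    parts: List[str] = field(default_factory=list)
    # (page, bbox) pairs let a retrieved chunk cite its exact source regions.
    sources: List[Tuple[int, BBox]] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "\n".join(self.parts)


def chunk_blocks(blocks: List[Block]) -> List[Chunk]:
    """Start a new chunk at each title; give each table its own chunk."""
    chunks: List[Chunk] = []
    heading = ""
    current: Optional[Chunk] = None
    for block in blocks:
        if block.block_type == "title":
            heading = block.text
            current = None  # the next body block opens a fresh chunk
            continue
        if block.block_type == "table":
            chunks.append(Chunk(heading, [block.text], [(block.page, block.bbox)]))
            current = None
            continue
        if current is None:
            current = Chunk(heading)
            chunks.append(current)
        current.parts.append(block.text)
        current.sources.append((block.page, block.bbox))
    return chunks


# Made-up sample data, not real model output.
blocks = [
    Block("title", "Payment terms", 1, (72, 40, 540, 70)),
    Block("text", "Invoices are due in 30 days.", 1, (72, 90, 540, 110)),
    Block("text", "Late fees apply after that.", 1, (72, 115, 540, 135)),
    Block("table", "Tier A: 2% | Tier B: 4%", 1, (72, 150, 540, 220)),
    Block("title", "Contacts", 2, (72, 40, 540, 70)),
    Block("text", "Write to the billing team.", 2, (72, 90, 540, 110)),
]
for chunk in chunk_blocks(blocks):
    print(repr(chunk.heading), len(chunk.parts), chunk.sources[0])
```

Keeping the heading on every chunk gives the retriever context that a fixed-size text window would lose. The stored page and bounding-box pairs are what make source-grounded citations possible downstream.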
OCR 4 accepts common enterprise formats, including PDF, DOC, PPT, and OpenDocument, and supports 170 languages across 10 language groups, including rare and low-resource languages that many systems handle poorly. As a compact model deployable in a single container, it is suited to both cost-sensitive and high-volume deployments. It can run fully self-hosted, allowing organizations with data-sovereignty requirements to keep document data within their own infrastructure.<br>Developers integrate the model via API, and teams can use Document AI in Mistral...