Memora: A Harmonic Memory Representation Balancing Abstraction and Specificity - Microsoft Research
Skip to main content
Research
Publications<br>Code & data<br>People<br>Microsoft Research blog
Artificial intelligence<br>Audio & acoustics<br>Computer vision<br>Graphics & multimedia<br>Human-computer interaction<br>Human language technologies<br>Search & information retrieval
Data platforms and analytics<br>Hardware & devices<br>Programming languages & software engineering<br>Quantum computing<br>Security, privacy & cryptography<br>Systems & networking
Algorithms<br>Mathematics
Ecology & environment<br>Economics<br>Medical, health & genomics<br>Social sciences<br>Technology for emerging markets
Academic programs<br>Events & academic conferences<br>Microsoft Research Forum
Behind the Tech podcast<br>Microsoft Research blog<br>Microsoft Research Forum<br>Microsoft Research podcast
About Microsoft Research<br>Careers & internships<br>People<br>Emeritus program<br>News & awards<br>Microsoft Research newsletter
Africa<br>AI for Science<br>AI Frontiers<br>Asia-Pacific<br>Cambridge<br>Health Futures<br>India<br>Montreal<br>New England<br>New York City<br>Redmond
Applied Sciences<br>Mixed Reality & AI - Cambridge<br>Mixed Reality & AI - Zurich
Register: Research Forum
Microsoft Security<br>Azure<br>Dynamics 365<br>Microsoft 365<br>Microsoft Teams<br>Windows 365
Microsoft AI<br>Azure Space<br>Mixed reality<br>Microsoft HoloLens<br>Microsoft Viva<br>Quantum computing<br>Sustainability
Education<br>Automotive<br>Financial services<br>Government<br>Healthcare<br>Manufacturing<br>Retail
Find a partner<br>Become a partner<br>Partner Network<br>Microsoft Marketplace<br>Software companies
Blog<br>Microsoft Advertising<br>Developer Center<br>Documentation<br>Events<br>Licensing<br>Microsoft Learn<br>Microsoft Research
View Sitemap
Return to Blog Home<br>Microsoft Research Blog
At a glance
Today’s AI agents don’t remember past interactions. They must repeatedly be fed relevant information or retrieve it from external sources, which becomes less efficient as they handle longer and more complex tasks. To scale agent capabilities, we need a more efficient way to retain and access information over time.
Memora is a scalable memory system that dramatically increases agent productivity on long-horizon tasks by decoupling what is stored (rich memory content) from how it’s retrieved (lightweight abstractions and cue anchors), balancing abstraction and specificity.
Memora sets new state-of-the-art on LoCoMo and LongMemEval, outperforming Mem0, RAG, and full-context inference while using up to 98% fewer context tokens.
Memora paper (opens in new tab) is published at ICML 2026. Memora code is available at https://github.com/microsoft/Memora (opens in new tab).
Imagine a workplace AI assistant helping you run a multi-month project. Over weeks of conversations, you share constraints, agree on milestones, revise deadlines, and surface dozens of stakeholder preferences. When you later ask it to draft an update for a colleague, it should recall not just the latest decision but the journey that got you there: what was tried, what was ruled out, who weighed in. Today’s AI agents struggle with this. Modern large language models (LLMs) are powerful reasoners, but they are effectively stateless: every session starts from zero, every long conversation forces the model to re-read its entire history, and every new piece of information is either stored as raw text (fragmented and noisy) or compressed into a vague summary (precise details lost). As AI assistants and autonomous agents move into long-horizon deployments, such as copilots that tracks a project for many months or even research agents that build up domain expertise with long horizon usage, the absence of principled memory system has become the critical bottleneck.
A growing line of work has begun to fill this gap. Systems like Mem0 extract atomic facts from conversations; retrieval-augmented (RAG) approaches index raw text fragments for later recall; and graph-based memory systems such as Zep and GraphRAG impose structure through entity relations. Each represents real progress, yet each runs into the same wall: existing designs force an unavoidable tradeoff between specificity (preserving fine-grained detail) and abstraction (organizing memory efficiently as it grows). Memora is built to give agents both.
What is Memora
Memora is an agentic memory framework designed for long-horizon AI agents. Memora’s central insight is to decouple what is stored from how it is retrieved. Memory content can remain rich and expressive, such as a project timeline, a multi-turn discussion about constraints, while a separate, lightweight structural layer handles indexing and retrieval. The result is a memory system that scales: it consolidates related information into stable units, surfaces fine-grained details when they matter, and lets the agent navigate its own history without re-reading everything. On standard long-conversation benchmarks, Memora sets new state-of-the-art performance while using up to 98% fewer tokens than would be consumed by dumping the full history into...