Mellum2 Goes Open Source: A Fast Model for AI Workflows | The JetBrains AI Blog
Anton Semenkin, Nikita Pavlichenko
Trained from scratch and designed for practical deployment, Mellum2 is built for routing, Q&A, sub-agents, and private AI use in software engineering systems.
Today, we’re open-sourcing Mellum2, a 12B-parameter model engineered for the hardest parts of production AI: latency, throughput, and cost. Built from scratch and released under the Apache 2.0 license, Mellum2 offers a high-performance, cost-efficient alternative for your infrastructure.
Mellum began with code completion; Mellum2 handles both natural language and code. That makes it a versatile tool for routing, summarization, and intermediate reasoning steps across modern AI workflows.
Whether you want to experiment, fine-tune, or deploy at scale, Mellum2 is ready to run in your own systems.
Architecture and performance
Mellum2 targets the bottlenecks of production-scale systems through two design choices:
Mixture-of-Experts (MoE) design: The model has 12B total parameters, but because it uses an MoE design, only 2.5B parameters are active per token. This reduces compute costs while enabling high-throughput, low-latency inference for real-time workloads.
Specialized focus: Unlike many modern models, Mellum2 is not multimodal. It is trained specifically on natural language and code data. This specialization ensures the model excels in software engineering environments while remaining lean and fast.
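To make the MoE numbers concrete, here is a minimal back-of-the-envelope sketch. The 12B and 2.5B figures come from this post; the "about 2 FLOPs per active parameter per token" rule is a common rough approximation for a forward pass and is our assumption here, not a measurement of Mellum2. It ignores attention cost, routing overhead, and memory bandwidth.

```python
# Parameter counts quoted in the post.
TOTAL_PARAMS = 12e9    # all experts combined
ACTIVE_PARAMS = 2.5e9  # parameters actually used for each token


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that participate in one token."""
    return active / total


def approx_forward_flops_per_token(params: float) -> float:
    """Rough estimate only (assumption): ~2 FLOPs per parameter per token."""
    return 2.0 * params


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
moe_flops = approx_forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = approx_forward_flops_per_token(TOTAL_PARAMS)

print(f"active fraction per token: {fraction:.1%}")
print(f"estimated compute vs. a hypothetical dense 12B model: {moe_flops / dense_flops:.1%}")
```

Under these assumptions, each token touches roughly a fifth of the parameters. Note that an MoE model still needs all of its parameters loaded in memory, so the savings apply to compute per token rather than to memory footprint.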
In our technical report, we detail our model’s performance across code generation, science, math, and reasoning benchmarks. Mellum2 is competitive with other similar-sized models while cutting inference time to less than half – a decisive advantage for production-grade deployments.
Key use cases for Mellum2
Route and orchestrate AI workloads: Use Mellum2 to analyze incoming prompts and help select the right model or tool for each task.
Build low-latency RAG pipelines: Retrieve relevant context, use Mellum2 to summarize it, and generate responses with minimal delay.
Power fast sub-agents in complex workflows: Break down agent pipelines into steps like context gathering, planning, and validation. Use Mellum2 for fast, specialized tasks instead of relying on a single large model.
Enable private, local AI deployment: Run Mellum2 locally or self-host it to keep code and data fully under your control.
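The routing use case above could look something like the following minimal sketch. Everything named here is hypothetical: the route labels, the prompt wording, and the `classify` callable, which stands in for a call to a small, fast model such as Mellum2. The stub classifier uses a keyword heuristic purely so the example runs without any model.

```python
from typing import Callable, Dict

# Hypothetical route labels; a real system would define its own.
ROUTES = ("code", "summarize", "general")


def build_routing_prompt(user_prompt: str) -> str:
    """Wrap the user's request in a short classification instruction."""
    labels = ", ".join(ROUTES)
    return (
        f"Classify the request into one of: {labels}.\n"
        f"Request: {user_prompt}\n"
        f"Label:"
    )


def route(
    user_prompt: str,
    classify: Callable[[str], str],
    handlers: Dict[str, Callable[[str], str]],
    default: str = "general",
) -> str:
    """Ask the fast model for a label, then dispatch to the matching handler."""
    label = classify(build_routing_prompt(user_prompt)).strip().lower()
    if label not in handlers:
        label = default  # fall back when the model returns an unknown label
    return handlers[label](user_prompt)


def fake_classifier(routing_prompt: str) -> str:
    """Stand-in for a model call (assumption): a simple keyword heuristic."""
    request = routing_prompt.split("Request: ", 1)[1].rsplit("\nLabel:", 1)[0]
    lowered = request.lower()
    if "bug" in lowered or "def " in request:
        return "code"
    if "summar" in lowered:
        return "summarize"
    return "general"


# Hypothetical downstream handlers; each would call a different model or tool.
handlers: Dict[str, Callable[[str], str]] = {
    "code": lambda p: f"[code model] {p}",
    "summarize": lambda p: f"[summarizer] {p}",
    "general": lambda p: f"[general model] {p}",
}

print(route("Fix the bug in the parser", fake_classifier, handlers))
print(route("Summarize this design doc", fake_classifier, handlers))
```

Passing the classifier in as a plain callable keeps the routing logic independent of how the model is hosted, so the same code works whether the fast model runs locally or behind a self-hosted endpoint.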
The "focal model" philosophy: Why focused models scale better
As AI systems become more complex, performance bottlenecks shift from raw capability to latency, throughput, and cost at scale. Not every task requires the largest model. Many steps in modern AI systems are repetitive, latency-sensitive, and high-frequency. These steps benefit from a fast and reliable model that can be efficiently routed, hosted, and controlled.
At JetBrains, we believe the future belongs to coordinated systems, not single models. Frontier models will continue to push the limits, but practical AI products also require focal models: fast, specialized components that handle high-frequency tasks efficiently.
That’s the role we see for Mellum2 in the next generation of AI software tooling.
Get started with Mellum2
If you’re building AI systems for software engineering – whether inside an IDE, in a RAG pipeline, as part of an agent workflow, or fully on your own infrastructure – we’d love for you to try Mellum2.
Open source is how better tools get made.