Mellum2 Goes Open Source: A Fast Model for AI Workflows

Anton Semenkin Nikita Pavlichenko

Trained from scratch and designed for practical deployment, Mellum2 is built for routing, Q&A, sub-agents, and private AI use in software engineering systems.

Today, we’re open-sourcing Mellum2, a 12B model engineered to solve the hardest parts of production AI: latency, throughput, and cost. Built from scratch and released under the Apache 2.0 license, Mellum2 offers a high-performance, cost-efficient alternative for your infrastructure.

Mellum began as a code completion model; we have since extended it to handle both natural language and code. It is now a versatile tool for routing, summarization, and intermediate reasoning steps across your AI workflows.

Whether you want to experiment, fine-tune, or deploy at scale, Mellum2 is ready to run in your own systems.

Try Mellum

Architecture and performance

Mellum2 targets the bottlenecks of production-scale systems in two ways: a sparse architecture that keeps per-token compute low, and a deliberately narrow training focus.

Mixture-of-Experts (MoE) design: The model has 12B total parameters, but because it uses an MoE design, only 2.5B parameters are active per token. This reduces compute costs while enabling high-throughput, low-latency inference for real-time workloads.

Specialized focus: Unlike many modern models, Mellum2 is not multimodal. It is trained specifically on natural language and code data. This specialization ensures the model excels in software engineering environments while remaining lean and fast.
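To make the "active parameters" idea concrete, here is a minimal toy sketch of top-k expert routing in plain Python. It is not Mellum2's actual architecture: the number of experts, the value of k, the gate scores, and the scalar "experts" are all made-up assumptions for illustration. The only figures taken from this post are the 12B total and 2.5B active parameter counts, which work out to roughly 21% of the weights being used per token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and mix their outputs."""
    routed = top_k_route(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routed)
    used = [i for i, _ in routed]
    return output, used

# Hypothetical toy experts: each just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical gate scores for one token (a real gate is a learned layer).
gate_scores = [0.1, 2.0, -1.0, 1.0]

output, used = moe_forward(10.0, experts, gate_scores, k=2)
print(f"experts used: {used} of {len(experts)}")

# Figures from the post: 2.5B active out of 12B total parameters.
active_fraction = 2.5e9 / 12e9
print(f"active fraction per token: {active_fraction:.1%}")
```

In this toy setup only two of the four experts execute for the token, which is where the compute saving comes from: total capacity grows with the number of experts, while per-token cost grows only with k.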

In our technical report, we detail our model’s performance across code generation, science, math, and reasoning benchmarks. Mellum2 is competitive with other similar-sized models while cutting inference time to less than half – a definitive advantage for production-grade deployments.

Key use cases for Mellum2

Route and orchestrate AI workloads: Use Mellum2 to analyze incoming prompts and help select the right model or tool for each task.

Build low-latency RAG pipelines: Retrieve relevant context, use Mellum2 to summarize it, and generate responses with minimal delay.

Power fast sub-agents in complex workflows: Break down agent pipelines into steps like context gathering, planning, and validation. Use Mellum2 for fast, specialized tasks instead of relying on a single large model.

Enable private, local AI deployment: Run Mellum2 locally or self-host it to keep code and data fully under your control.
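The routing use case above can be sketched as a small dispatcher. This is an illustrative sketch under stated assumptions, not a JetBrains API: `keyword_classifier` is a hypothetical stand-in for a call to a small, fast model such as Mellum2, and the route names and handlers are invented placeholders for whatever models or tools your system actually exposes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Route:
    name: str                       # label for the target model or tool
    handler: Callable[[str], str]   # function that produces the response

def keyword_classifier(prompt: str) -> str:
    """Hypothetical stand-in for a fast model that labels each prompt.

    In a real system this would be a call to your deployed small model;
    keyword matching is used here only so the sketch runs offline.
    """
    text = prompt.lower()
    if any(w in text for w in ("summarize", "summary", "tl;dr")):
        return "summarize"
    if any(w in text for w in ("refactor", "architecture", "design")):
        return "complex"
    return "simple"

def make_router(classify: Callable[[str], str],
                routes: Dict[str, Route],
                default: str) -> Callable[[str], Tuple[str, str]]:
    """Build a function that classifies a prompt and dispatches it."""
    def route(prompt: str) -> Tuple[str, str]:
        label = classify(prompt)
        # Fall back to the default route for labels we do not know.
        chosen = routes.get(label, routes[default])
        return chosen.name, chosen.handler(prompt)
    return route

# Placeholder handlers; real ones would call the selected model or tool.
routes = {
    "simple": Route("small-model", lambda p: f"[small] {p}"),
    "summarize": Route("small-model-summarizer", lambda p: f"[summary] {p}"),
    "complex": Route("large-model", lambda p: f"[large] {p}"),
}

route = make_router(keyword_classifier, routes, default="simple")

for prompt in ("Summarize this stack trace",
               "Refactor the payment module design",
               "What does this function return?"):
    target, _ = route(prompt)
    print(f"{prompt!r} -> {target}")
```

Keeping the classifier as a plain function argument means the keyword stub can be swapped for a real model call without touching the dispatch logic, and the default route keeps the system responsive when the classifier returns an unexpected label.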

The "focal model" philosophy: Why focused models scale better

As AI systems become more complex, performance bottlenecks shift from raw capability to latency, throughput, and cost at scale. Not every task requires the largest model. Many steps in modern AI systems are repetitive, latency-sensitive, and high-frequency. These steps benefit from a fast and reliable model that can be efficiently routed, hosted, and controlled.

At JetBrains, we believe the future belongs to coordinated systems, not single models. Frontier models will continue to push the limits, but practical AI products also require focal models: fast, specialized components that handle high-frequency tasks efficiently.

That’s the role we see for Mellum2 in the next generation of AI software tooling.

Get started with Mellum2

If you’re building AI systems for software engineering – whether inside an IDE, in a RAG pipeline, as part of an agent workflow, or fully on your own infrastructure – we’d love for you to try Mellum2.

Open source is how better tools get made.
