NVIDIA Nemotron 3 Ultra

We present our most capable model yet: Nemotron 3 Ultra, with 550 billion total and 55 billion active parameters. Nemotron 3 Ultra is the final and best model of the Nemotron 3 family.

Key Features

Employs a Mixture-of-Experts hybrid Mamba-Attention architecture.

Leverages LatentMoE for improved accuracy.

Includes Multi-Token Prediction (MTP) layers for faster inference through native speculative decoding.

Supports inference-time reasoning budget control.

Pretrained in NVFP4.

Post-trained with an enhanced pipeline involving Supervised Fine-Tuning (SFT), Reinforcement Learning (RL), and Multi-teacher On-Policy Distillation (MOPD) for improved model accuracy.
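To illustrate why speculative decoding can speed up inference, here is a minimal sketch of the generic draft-then-verify loop in its greedy form. It is not NVIDIA's MTP implementation: the function `speculative_step` and the toy `draft_next` / `target_next` models are hypothetical stand-ins, and a real system would verify all drafted positions in one batched forward pass rather than in a Python loop.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]

def speculative_step(prefix: List[int], draft_next: NextToken,
                     target_next: NextToken, num_draft: int = 4) -> List[int]:
    """Return the tokens emitted by one greedy draft-then-verify step."""
    # 1. Draft: a cheap predictor proposes num_draft tokens autoregressively.
    drafted: List[int] = []
    ctx = list(prefix)
    for _ in range(num_draft):
        tok = draft_next(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2. Verify: keep drafted tokens while they match the target's own choice.
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in drafted:
        expected = target_next(ctx)
        if tok != expected:
            # First mismatch: emit the target's token instead and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # Every draft matched, so the verification pass yields one extra token.
    accepted.append(target_next(ctx))
    return accepted

# Toy models (assumptions for illustration only): the target counts upward
# modulo 10; the draft agrees except right after token 3.
def target_next(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10

def draft_next(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10

tokens = speculative_step([0], draft_next, target_next, num_draft=4)
```

In this toy run the first three drafted tokens are accepted and the fourth is corrected by the target, so one verification step emits four tokens instead of one. The output is identical to what the target alone would have produced; only the number of target passes changes.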

Key Highlights

Nemotron 3 Ultra achieves 5.9x, 4.8x, and 1.6x higher inference throughput than GLM-5.1-754B-A40B, Kimi-K2.6-1T-A32B, and Qwen-3.5-397B-17B, respectively, in the 8k-token input / 64k-token output setting.

Nemotron 3 Ultra achieves accuracy on par with other state-of-the-art open LLMs across a diverse set of benchmarks.

Supports a context length of up to 1M tokens while outperforming state-of-the-art open LLMs on RULER at 1M context length.

Open Source

We are releasing the pre-trained, post-trained, and quantized checkpoints along with the datasets used for training.

Checkpoints:

Nemotron 3 Ultra 550B-A55B NVFP4: post-trained and NVFP4-quantized model

Nemotron 3 Ultra 550B-A55B BF16: post-trained model

Nemotron 3 Ultra 550B-A55B Base BF16: base model

Nemotron 3 Ultra 550B-A55B GenRM: GenRM used for RLHF
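For a rough sense of scale, the parameter counts above can be turned into back-of-the-envelope weight sizes. The sketch below is only an estimate under stated assumptions: it treats every weight as stored in the named format, and it models NVFP4 as 4-bit values plus one 8-bit scale per 16-element block (about 4.5 bits per weight). The released checkpoints may keep some layers in higher precision, so actual file sizes will differ.

```python
TOTAL_PARAMS = 550e9   # total parameters, from the model name 550B-A55B
ACTIVE_PARAMS = 55e9   # parameters active per token

def weight_terabytes(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in terabytes (1 TB = 1e12 bytes)."""
    return num_params * bits_per_param / 8 / 1e12

# BF16 stores each weight in 16 bits.
bf16_tb = weight_terabytes(TOTAL_PARAMS, 16)

# Assumption: NVFP4 = 4-bit values + one 8-bit scale shared by 16 values.
NVFP4_BITS = 4 + 8 / 16
nvfp4_tb = weight_terabytes(TOTAL_PARAMS, NVFP4_BITS)

# Fraction of the model's parameters used for each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"BF16 weights:  ~{bf16_tb:.2f} TB")
print(f"NVFP4 weights: ~{nvfp4_tb:.2f} TB")
print(f"Active per token: {active_fraction:.0%} of parameters")
```

Under these assumptions the BF16 weights come to roughly 1.1 TB and the NVFP4 weights to roughly 0.31 TB, while each token touches about 10% of the parameters.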

Data:

Nemotron-Pretraining-Code-v3: 173B tokens of fresh code data from GitHub through September 30, 2025.

Nemotron-Pretraining-Legal-v1: A collection of synthetic datasets intended to improve the legal capabilities of LLMs.

Nemotron-Pretraining-Specialized-v1.2: A collection of synthetic datasets intended to improve LLM capabilities on factual recall, moral scenarios, and diverse generative and multiple-choice questions.

Nemotron-Posttraining-v3: A collection of post-training datasets for improving agentic, reasoning, and general model capabilities during SFT and RL.
