NVIDIA Nemotron 3 Super - NVIDIA Nemotron
You are using an outdated browser. Please upgrade your browser to improve your experience.
NVIDIA
Follow
Models Super Tech Report Nemotron 3 Blog
We are releasing NVIDIA Nemotron 3 Super, a 12B active 120B total parameter Mixture-of-Experts hybrid Mamba-Transformer model. Nemotron 3 Super is part of the Nemotron 3 series of models , and is the first model in the series that:
Leverages LatentMoE for improved accuracy.
Includes MTP layers for faster inference through native speculative decoding.
Is Pretrained in NVFP4 .
Key Highlights
Nemotron 3 Super achieves upto 2.2x and 7.5x higher inference throughput than GPT-OSS-120B and Qwen3.5-122B , respectively, on the 8k token input / 64k token output setting.
Nemotron 3 Super achieves higher or comparable accuracies to GPT-OSS-120B and Qwen3.5-122B across a diverse set of benchmarks.
Supports context length of up to 1M tokens while outperforming both GPT-OSS-120B and Qwen3.5-122B on RULER at 1M context length.
Open Source
We are releasing the pre-trained, post-trained, and quantized checkpoints along with the datasets used for training.
Checkpoints:
Nemotron 3 Super 120B-A12B NVFP4 : post-trained and NVFP4 quantized model
Nemotron 3 Super 120B-A12B FP8 : post-trained and FP8 quantized model
Nemotron 3 Super 120B-A12B BF16 : post-trained model
Nemotron 3 Super 120B-A12B Base BF16 : base model
Qwen3-Nemotron-235B-A22B-GenRM-2603 : GenRM used for RLHF
Data:
Nemotron-Pretraining-Specialized-v1.1 : a collection of synthetic datasets aimed to improve LLM capabilities in code concepts and algorithms, formal logic, economics, and multiple choice questions.
Nemotron-Super-Post-Training-Data : a collection of RL environments and SFT datasets targeting a broad range of agentic capabilities.
Model Recipes:
NVIDIA Nemotron Developer Repository
Tech Report
More technical details in the Tech Report
Share on