Nvidia Drops Nemotron 3 Super Amid $26 Billion Open-Model AI Bet—America's Answer to Qwen?

Summary

Nvidia released Nemotron 3 Super, a 120-billion-parameter open-weight AI model designed for compute-efficient autonomous AI agents. Using a mixture-of-experts (MoE) architecture, the model activates only 12 billion parameters per token, about 10% of the total, supporting complex reasoning at a fraction of the cost of a comparably sized dense model. It features a 1-million-token context window and combines Mamba-2 state-space layers, Transformer attention layers, and a new “Latent MoE” design, enabling efficient long-context handling and more expert activations at the same compute cost. The model is pretrained natively in Nvidia's 4-bit NVFP4 format, maintaining accuracy despite the reduced precision and delivering more than five times the throughput of its predecessor.

In benchmarks, Nemotron 3 Super runs 2.2x faster than OpenAI's GPT-OSS 120B and 7.5x faster than Alibaba's Qwen3.5-122B. Nvidia has made both the model weights and the training pipeline fully public. The release is part of a $26 billion strategy to expand open-weight models, motivated in part by the rapid rise of open-source AI in China and the risk that Chinese AI models and hardware could form an ecosystem independent of Nvidia's infrastructure.
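To make the sparse-activation idea concrete, here is a minimal sketch of top-k MoE routing in Python. It is illustrative only: Nvidia has not published the router used in Nemotron 3 Super, so the gating scheme, expert count, and names below (moe_layer, N_EXPERTS, TOP_K) are assumptions chosen to mirror the 12B-active-of-120B-total ratio.

```python
# Illustrative top-k mixture-of-experts routing; a toy sketch, not
# Nemotron 3 Super's actual architecture (which Nvidia has not detailed).
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 64     # hidden size (toy scale)
N_EXPERTS = 10   # total experts -- mirrors the 120B-total / 12B-active ratio
TOP_K = 1        # experts activated per token (10% of weights in this toy)

# Each expert is a small feed-forward weight matrix; the router is a
# learned gating network that scores experts per token.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts; only those weights are used."""
    logits = x @ router                             # (tokens, experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]   # chosen expert indices
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = top[t]
        # Softmax over the selected experts' logits gives mixing weights.
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        for k, e in enumerate(sel):
            out[t] += w[k] * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((4, D_MODEL))
print(moe_layer(tokens).shape)  # (4, 64): each token touched 1 of 10 experts
```

The design's payoff is in the inner loop: each token's forward pass touches only the selected experts' weights, so per-token compute scales with the active parameter count (12 billion) rather than the total (120 billion).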
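The NVFP4 claim is easier to appreciate with a toy quantization round-trip. The sketch below rounds weights onto the 4-bit FP4 (E2M1) value grid with one scale factor per 16-element block; keeping that scale as a plain float is a simplification for illustration, not Nvidia's published format spec.

```python
# Rough simulation of 4-bit FP4 (E2M1) quantization with per-block scaling,
# in the spirit of Nvidia's NVFP4 format; a simplified sketch, not the spec.
import numpy as np

# The eight non-negative values representable in E2M1
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
BLOCK = 16  # per-block scaling granularity

def fp4_round_trip(x: np.ndarray) -> np.ndarray:
    """Quantize each 16-value block to scaled E2M1, then dequantize back."""
    out = np.empty_like(x)
    for i in range(0, x.size, BLOCK):
        block = x[i:i + BLOCK]
        # One scale per block so the block's max magnitude maps onto 6.0,
        # the largest E2M1 value. (NVFP4 stores block scales in FP8; here
        # the scale stays a plain float for simplicity.)
        scale = np.abs(block).max() / E2M1_GRID[-1]
        if scale == 0.0:
            scale = 1.0  # all-zero block: any scale works
        scaled = np.abs(block) / scale
        # Nearest-neighbor rounding onto the FP4 grid; restore signs after.
        idx = np.abs(scaled[:, None] - E2M1_GRID[None, :]).argmin(axis=1)
        out[i:i + BLOCK] = np.sign(block) * E2M1_GRID[idx] * scale
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)
w4 = fp4_round_trip(w)
print("mean abs error:", float(np.abs(w - w4).mean()))  # small despite 4 bits
```

Even with only eight magnitude levels per value, the per-block scales keep the round-trip error small, which is the property that makes low-precision pretraining viable on hardware with native FP4 support.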