🔥 Nvidia's Nemotron 3 cuts AI agent costs with 120B MoE model

AlphaSignal · 6 min read

AI Summary

Nvidia released Nemotron 3 Super, a 120B-parameter MoE model with a hybrid Mamba-Transformer architecture, targeting the exploding cost of agentic AI workflows that rely on frontier closed models like Claude Opus 4.6 at $25 per million output tokens. The model is natively trained in the NVFP4 format for Blackwell B200 GPUs, creating a hardware-software flywheel that drives Nvidia chip sales. Nvidia is reportedly investing $26B over five years in open-weight AI to build a software moat alongside CUDA and pressure proprietary labs like OpenAI and Anthropic.
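
To put that pricing pressure in rough numbers, here is a back-of-envelope sketch. Only the $25 per million output tokens comes from the article; the per-task token volume, the task count, and the self-hosted rate are illustrative assumptions.

```python
# Back-of-envelope cost of an agentic workload at the article's quoted
# $25 per million output tokens. Token volumes and the self-hosted rate
# are illustrative assumptions, not figures from the article.
CLOSED_PRICE_PER_TOKEN = 25 / 1_000_000   # Claude Opus 4.6 output price (per article)
SELF_HOSTED_PER_TOKEN = 2 / 1_000_000     # hypothetical rate on rented B200s

tokens_per_task = 50_000    # assumed: multi-step agent with long tool traces
tasks_per_day = 1_000       # assumed workload

closed_cost = tokens_per_task * tasks_per_day * CLOSED_PRICE_PER_TOKEN
hosted_cost = tokens_per_task * tasks_per_day * SELF_HOSTED_PER_TOKEN
print(f"closed model: ${closed_cost:,.0f}/day")   # -> $1,250/day
print(f"self-hosted:  ${hosted_cost:,.0f}/day")   # -> $100/day
```

At agentic volumes, output tokens dominate the bill, which is why a cheap-to-serve open-weight model is pitched as the cost lever.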

Key Facts

  • Nemotron 3 Super is Nvidia's 120B MoE open-source model with a 1M token context window, scoring 36 on the Artificial Analysis Intelligence Index and delivering higher throughput on B200 GPUs than comparable open models.
  • Nvidia is investing $26B over five years in open-weight AI to create a hardware-software flywheel where model adoption drives Blackwell GPU sales via native NVFP4 co-design (a simplified quantization sketch follows this list).
  • The open-source AI ecosystem has a leadership vacuum as Meta slows Llama releases, DeepSeek R2 faces training instability delays, and Alibaba's Qwen team loses key members.
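
For intuition on what NVFP4 co-design means, the sketch below simulates 4-bit block quantization in plain NumPy. This is not Nvidia's spec: the E2M1 value grid is standard FP4, but the 16-element block size and the simple max-abs scaling rule here are assumptions standing in for NVFP4's actual per-block scale format.

```python
import numpy as np

# Magnitudes representable in FP4 E2M1, the 4-bit element format NVFP4
# builds on (1 sign, 2 exponent, 1 mantissa bit).
E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
BLOCK = 16  # assumed block size; NVFP4 stores a scale per small block

def quantize_block(x):
    """Quantize one block: pick a scale so the block's max magnitude lands
    on E2M1's max (6.0), then snap each element to the nearest grid value."""
    scale = max(np.abs(x).max() / 6.0, 1e-12)
    scaled = x / scale
    idx = np.abs(np.abs(scaled)[:, None] - E2M1[None, :]).argmin(axis=1)
    return np.sign(scaled) * E2M1[idx], scale

def quantize_dequantize(w):
    """Round-trip a weight vector through simulated 4-bit blocks."""
    out = np.empty_like(w)
    for i in range(0, len(w), BLOCK):
        q, s = quantize_block(w[i:i + BLOCK])
        out[i:i + BLOCK] = q * s
    return out

w = np.random.randn(64).astype(np.float32)
print("mean abs error:", np.abs(w - quantize_dequantize(w)).mean())
```

The co-design point: because Blackwell has dedicated NVFP4 tensor-core paths, a model trained natively in this format hits its peak throughput only on Nvidia's own hardware.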

Contrarian Angle

Nvidia Uses Open-Source AI as a Direct Hardware Revenue Driver

Unlike OpenAI and Mistral, which release open weights for brand awareness and to upsell their closed models, Nvidia co-designs its open-weight releases with its silicon (NVFP4 on B200 GPUs), so peak performance is achievable only on Nvidia hardware, directly driving chip sales.

Open source is typically seen as giving away value; Nvidia monetizes it directly through hardware lock-in, without needing a closed-model tier.
