
NVIDIA Releases Nemotron 3 Super: Open 120B-Parameter Model Targets Enterprise Agentic AI

NVIDIA has released Nemotron 3 Super, a 120-billion-parameter open-weights model built on a hybrid Mamba-Transformer architecture with a one-million-token context window. The model delivers 5x throughput improvements over its predecessor and is designed specifically for enterprise agentic AI workflows.

TechDrop Editorial

NVIDIA has released Nemotron 3 Super, a 120-billion-parameter open-weights model that represents a significant architectural departure from conventional transformer-only designs. Built on a hybrid Mamba-Transformer mixture-of-experts architecture with only 12 billion active parameters at inference time, the model delivers 5x throughput improvements over NVIDIA's previous-generation models while supporting a one-million-token context window.

Hybrid Architecture

Nemotron 3 Super combines the Mamba selective state space model with traditional transformer attention layers in a mixture-of-experts configuration. This hybrid approach allows the model to handle long-context tasks — such as analyzing entire codebases or processing lengthy documents — with significantly less compute than pure transformer models of comparable capability. The 12B active parameter count means that despite having 120B total parameters, each forward pass only activates a fraction of the model, dramatically reducing inference costs.
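The sparse-activation idea can be illustrated with a minimal mixture-of-experts routing sketch. This is a generic top-k MoE layer for illustration only, not NVIDIA's actual architecture; the expert count, dimensions, and router here are arbitrary assumptions.

```python
import numpy as np

def moe_forward(x, router_w, experts, k=2):
    """Generic top-k MoE routing: only the k highest-scoring experts
    run for a given token, so most expert parameters stay idle."""
    logits = x @ router_w                    # router scores, one per expert
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # softmax over experts
    top = np.argsort(probs)[-k:]             # indices of the k most relevant experts
    gate = probs[top] / probs[top].sum()     # renormalized gating weights
    return sum(g * experts[i](x) for g, i in zip(gate, top))

rng = np.random.default_rng(0)
d, num_experts = 16, 8
# Each "expert" is just a linear map here; with k=2 of 8 experts,
# only a quarter of the expert parameters participate per token.
weights = [rng.standard_normal((d, d)) for _ in range(num_experts)]
experts = [lambda x, W=W: x @ W for W in weights]
router_w = rng.standard_normal((d, num_experts))
y = moe_forward(rng.standard_normal(d), router_w, experts, k=2)
print(y.shape)  # (16,)
```

Scaled up, the same principle is what lets a 120B-parameter model pay inference costs closer to those of a 12B dense model.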

NVIDIA has published both the model weights and the 10-trillion-token training dataset, making Nemotron 3 Super one of the most transparent large-scale model releases to date. The open training data is particularly notable — most model providers treat training data composition as a closely guarded secret.

Enterprise Agentic AI Focus

The model is explicitly designed for enterprise agentic AI workflows — multi-step tasks where an AI agent needs to plan, execute, and iterate autonomously. NVIDIA has optimized Nemotron 3 Super for tool calling, structured output generation, and multi-turn reasoning, the core capabilities required for agents that can navigate complex enterprise systems.
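Tool calling in practice usually means handing the model a JSON schema of available functions alongside the conversation. The sketch below builds such a request in the widely used OpenAI-style format; the model identifier and the `lookup_invoice` tool are hypothetical, chosen only to illustrate the shape of an agentic request.

```python
import json

# Hypothetical tool schema in OpenAI-style function-calling format.
tool = {
    "type": "function",
    "function": {
        "name": "lookup_invoice",
        "description": "Fetch an invoice record by ID from the ERP system.",
        "parameters": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
    },
}

request = {
    "model": "nemotron-3-super",  # assumed identifier, not confirmed by NVIDIA
    "messages": [{"role": "user", "content": "Pull up invoice INV-1042."}],
    "tools": [tool],
    "tool_choice": "auto",        # let the model decide when to call the tool
}

print(json.dumps(request)[:40])
```

An agent loop would send this request, execute any tool call the model emits, append the result as a tool message, and iterate until the task completes.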

The model integrates natively with NVIDIA's NIM (NVIDIA Inference Microservices) platform, allowing enterprises to deploy it on their own infrastructure without sending data to external APIs. This on-premises capability is critical for regulated industries — healthcare, finance, defense — where data sovereignty requirements make cloud-hosted AI models impractical.
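Because NIM exposes OpenAI-compatible HTTP endpoints, a self-hosted deployment can be queried with nothing but the standard library. The base URL and model name below are assumptions for a typical local setup, not documented values for this release.

```python
import json
import urllib.request

# Typical local NIM endpoint (assumption; adjust to your deployment).
NIM_BASE = "http://localhost:8000/v1"

def chat(prompt, model="nemotron-3-super", base=NIM_BASE):
    """POST one chat completion to a self-hosted NIM; data stays on-prem."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Since the request never leaves the host, this pattern satisfies data-sovereignty constraints that rule out cloud-hosted APIs.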

The Nemotron 3 Family

Nemotron 3 Super sits in the middle of a three-model family. Nemotron 3 Nano (8B parameters) targets edge and mobile deployments, while Nemotron 3 Ultra (200B+ parameters) is designed for the most demanding research and enterprise workloads. All three models share the same hybrid architecture and are trained on the same data pipeline, ensuring behavioral consistency across deployment scales.

The release comes just days before NVIDIA's GTC 2026 conference, where the company is expected to provide deeper technical details on the Nemotron architecture and announce enterprise partnerships built around the model family.
