The Neuron: AI Explained

Kari Briski at GTC 2026: The Future of NVIDIA AI & Nemotron 3

Mar 19, 2026

Kari Briski, VP of Generative AI Software for Enterprise at NVIDIA, leads the Nemotron model family. She unveils Nemotron 3 Super and explains how a 120B model can run at roughly the cost of a 12B model through routing and efficiency. The conversation covers multi-agent systems moving to production, OpenClaw as a system-level security and orchestration approach, and the rapid growth of open-model token use.

AI Snips

INSIGHT

Super Balances Smarts With Token Economics

Nemotron 3 Super was designed both to boost intelligence and to make multi-agent systems economical by improving token efficiency and latency.
Briski framed the model as a response to rising agent-to-agent token usage and the need for smarter, cheaper token generation in production.

INSIGHT

Architecture And Co-Design Deliver 3–5x Speedups

Architectural choices (a hybrid of Mamba state-space and Transformer layers, plus latent MoE) compound to produce 3–5x latency improvements.
Briski called this extreme co-design: matching model architecture to infrastructure for a smaller compute footprint and faster token throughput.

INSIGHT

Failures Often Come From The Harness Not The Model

Production failures often stem from memory management and the harness rather than from the model's instruction following alone.
Briski emphasized evaluating the total system (orchestrator, prompting, memory, and tools) when diagnosing context failures.
