The Neuron: AI Explained

Kari Briski at GTC 2026: The Future of NVIDIA AI & Nemotron 3

Mar 19, 2026
Kari Briski, VP of Generative AI Software for Enterprise at NVIDIA, leads the Nemotron model family. She unveils Nemotron 3 Super and explains how routing and efficiency work let a 120B model run like a 12B one. The conversation covers multi-agent systems moving to production, OpenClaw as a system-level security and orchestration approach, and the rapid growth of open-model token use.
INSIGHT

Super Balances Smarts With Token Economics

  • Nemotron 3 Super was designed both to boost intelligence and to make multi-agent systems economical by improving token efficiency and latency.
  • Briski framed the model as a response to rising agent-to-agent token usage and the need for smarter, cheaper token generation in production.
INSIGHT

Architecture And Co-Design Deliver 3x Speedups

  • Architectural choices (a hybrid of Mamba state-space and Transformer layers, latent MoE and compounds) produce 3–5x latency improvements.
  • Briski called this extreme co-design: matching model architecture to infrastructure for a smaller compute footprint and faster token throughput.
INSIGHT

Failures Often Come From The Harness Not The Model

  • Production failures are often about memory management and harnesses rather than just model instruction following.
  • Briski emphasized evaluating the total system (orchestrator, prompting, memory and tools) when diagnosing context failures.