The MAD Podcast with Matt Turck

Voice AI’s Big Moment: Why Everything Is Changing Now (ft. Neil Zeghidour, Gradium AI)

Feb 19, 2026
Neil Zeghidour, AI researcher and CEO of Gradium AI (ex-DeepMind/Google, Meta), gives a tour of modern voice AI. He explains why voice interaction finally feels natural, the shift from cascaded stacks to speech-to-speech and full-duplex models, and the engineering tradeoffs of compact, on-device models. Topics include neural audio codecs, instant voice cloning, the challenges of noisy multi-speaker audio, and how small teams can build production-grade voice systems.
INSIGHT

Naturalness Is Timing And Emotion

  • Naturalness includes timing dynamics and appropriate emotion, not just low latency.
  • Full-duplex interaction (listening and speaking at the same time) removes turn-taking latency and makes conversation fluid.
ADVICE

Prioritize Compact Models For Scale

  • Focus on compact, targeted speech models rather than huge multimodal behemoths to run voice at scale.
  • Prioritize efficiency and small-model design to enable on-device use and sustainable economics.
ADVICE

Trust Human Listening Over Metrics

  • Use human listening tests as the primary evaluation for audio quality instead of automatic proxies.
  • Run frequent blind tests because automated metrics fail on real-world audio and subjective experience matters most.