The Stack Overflow Podcast

Even your voice is a data problem

Feb 13, 2026
Scott Stephenson is the CEO and co-founder of Deepgram, a former particle physicist turned voice-AI leader. He discusses tackling speech recognition for noisy, real-world audio. The conversation covers data vs. input features, synthetic audio generation, scalable low-latency streaming in the cloud, responsible limits on voice cloning, and modular AI architectures for connected voice agents.
ANECDOTE

Underground Physics Recordings Sparked Deepgram

  • Scott Stephenson described recording over a thousand hours of audio, to preserve memories and highlights, while running a deep underground physics experiment.
  • That dataset inspired building search and transcription tools because existing solutions couldn't find the "highlight reel."
INSIGHT

Narrow Scope Enabled Faster Product-Market Fit

  • Narrow initial product scope helped Deepgram achieve commercial traction by targeting English customer-service calls first.
  • Focusing on throughput and price-per-hour enabled adoption against incumbents like Nuance and IBM.
INSIGHT

End-To-End Models Reduce Loss And Cost

  • Deep end-to-end models remove lossy intermediate components and enable faster, cheaper speech recognition.
  • Modular architectures, combined with adaptation on small amounts of labeled data, let models improve for each customer.