The AI in Business Podcast

Why Ensemble Architectures Win Against Real-Time Voice Risk - with Mike Pappas of Modulate

Mar 20, 2026
Mike Pappas, Co‑founder and CEO at Modulate, builds audio-native voice intelligence for real-time fraud and deepfake detection. He explains why live voice needs specialized, multi-model listening to catch social engineering and adversarial audio signals. The conversation covers ensemble audio models, where they outperform text systems, and how to evaluate voice AI by speed, accuracy, and adaptability.
INSIGHT

Catching Fraud In The Act Prevents The Largest Harms

  • Real-time detection matters because the worst harms occur when fraud is only noticed after funds are gone.
  • Mike Pappas explains that prevention avoids immediate losses, regulatory exposure, and the long-term loss of customer trust, none of which post hoc detection can repair.
INSIGHT

Prevention Steps Can Introduce Costly Friction

  • Preventative measures can create hidden costs like longer call flows, regulatory exposure from biometric storage, and increased staffing needs.
  • Mike describes how voice ID steps can add minutes of friction and introduce new compliance risks, reducing throughput.
INSIGHT

LLMs Miss Voice Cues Critical To Fraud Detection

  • General-purpose LLMs struggle with fraud detection because they are sycophantic and text-native, so they lose vocal nuance and adversarial signals.
  • Mike notes that voice carries cues (emotion, timbre, background noise) that transcripts alone cannot capture for adversarial detection.