Vanishing Gradients

Episode 70: 1,400 Production AI Deployments

Feb 12, 2026
Alex Strick van Linschoten, ML engineer and curator of the LLMOps database, tracks real-world production AI deployments. He recounts an infinite-loop incident that cost nearly $50K and warns about silent failures. The conversation also covers ripping out and rebuilding agent systems, extreme low-latency voice agents that discard context, three-tier agent architectures, the 100-to-1 token noise problem, and when simple tools beat complex stacks.
ANECDOTE

$50k Infinite Loop Disaster

  • A company left an agent stuck in an infinite loop unnoticed and ran up nearly $50,000 in API costs over a month.
  • Alex Strick van Linschoten cites this as a cautionary production failure: silent loops need monitoring and hard limits to stop them.
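
One way to keep a silent loop from running for a month is to give every agent run a hard step limit and a spend ceiling. The sketch below is illustrative only and is not taken from the episode: `RunGuard`, `BudgetExceeded`, `run_agent`, and the default limits are hypothetical names and values, and `step_fn` stands in for whatever performs one agent step and reports its cost.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when an agent run passes its step or spend limit."""


@dataclass
class RunGuard:
    max_steps: int = 50        # hypothetical cap on loop iterations
    max_cost_usd: float = 5.0  # hypothetical spend ceiling per run
    steps: int = 0
    cost_usd: float = 0.0

    def record(self, step_cost_usd: float) -> None:
        """Count one step and its cost; raise once either limit is passed."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd:.2f} exceeded")


def run_agent(step_fn, guard: RunGuard) -> RunGuard:
    """Call step_fn until it reports done or the guard trips.

    step_fn is an assumed callable returning (done, cost_usd) for one step.
    """
    while True:
        done, cost = step_fn()
        guard.record(cost)
        if done:
            return guard
```

A guard like this turns a silent failure into a loud one: the run stops with an exception that can be logged or alerted on, instead of continuing to bill.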
INSIGHT

Match Models To Trajectory Types

  • Opus and Gemini have diverged: Opus excels at long tool-call trajectories, while Gemini Flash is optimized for fast document processing.
  • Builders match models to task patterns rather than assuming one model fits all.
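
Matching models to task patterns can be as simple as an explicit routing table. The sketch below is an assumption-laden illustration: the pattern names and the strings `"opus"` and `"gemini-flash"` are placeholder labels, not real API model identifiers, and `pick_model` is a hypothetical helper.

```python
# Placeholder labels only; substitute the identifiers your provider uses.
ROUTES = {
    "long_tool_trajectory": "opus",
    "fast_document_processing": "gemini-flash",
}


def pick_model(task_pattern: str) -> str:
    """Return the model label routed to a task pattern.

    Unknown patterns raise instead of falling back to a default,
    so a new task type forces an explicit routing decision.
    """
    try:
        return ROUTES[task_pattern]
    except KeyError:
        raise ValueError(f"no model routed for pattern: {task_pattern!r}") from None
```

Raising on an unknown pattern is a deliberate choice here: a silent default would quietly reintroduce the one-model-fits-all assumption.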
ADVICE

Apply Reduce–Offload–Isolate

  • Reduce, offload, and isolate context to prevent spiraling token usage in agents.
  • Compact or externalize tool results and delegate heavy tasks to specialist sub-agents.
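
The three moves can be sketched as small helpers. This is a minimal illustration under stated assumptions, not the approach described in the episode: the function names, the 200-character threshold, the temp-directory store, and the `worker` callable (standing in for a specialist sub-agent) are all hypothetical.

```python
import tempfile
from pathlib import Path

MAX_INLINE_CHARS = 200  # hypothetical threshold for what stays in context
STORE_DIR = Path(tempfile.mkdtemp(prefix="agent_offload_"))  # assumed scratch store


def reduce_result(text: str, limit: int = MAX_INLINE_CHARS) -> str:
    """Reduce: keep the head of a tool result and note what was cut."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [truncated {len(text) - limit} chars]"


def offload_result(text: str, name: str, store_dir: Path = STORE_DIR) -> str:
    """Offload: write the full result to disk and return a short reference."""
    path = store_dir / f"{name}.txt"
    path.write_text(text, encoding="utf-8")
    return f"[stored {len(text)} chars at {path.name}]"


def isolate_task(task: str, worker) -> str:
    """Isolate: run a heavy task in a fresh context and return only the summary.

    worker is an assumed callable that takes a context list and returns a string.
    """
    sub_context = [task]  # the sub-agent starts with its own context
    return worker(sub_context)
```

In each case the main agent's context receives a short string rather than the full payload, which is what keeps token usage from spiraling.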