Episode 70: 1,400 Production AI Deployments

Feb 12, 2026

Alex Strick van Linschoten, ML engineer and curator of the LLMOps database, tracks real-world production AI deployments. He recounts an infinite loop that cost nearly $50K and warns about silent failures. The conversation covers ripping out and rebuilding agent systems, extreme low-latency voice agents that toss context, three-tier agent architectures, the 100-to-1 token noise problem, and when simple tools beat complex stacks.

ANECDOTE

$50K Infinite Loop Disaster

A company left an agent stuck in an infinite loop unnoticed and incurred nearly $50,000 in API costs over a month.
Alex Strick van Linschoten cites it as a cautionary production failure: monitor agents so that silent loops are caught and stopped early.
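
One way to guard against this kind of silent loop is a per-run budget check. The sketch below is illustrative only and is not described in the episode: `RunGuard`, `BudgetExceeded`, and all of the limits are hypothetical names and values. It assumes the agent loop can report each action and its estimated cost.

```python
from dataclasses import dataclass
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses a configured limit."""


@dataclass
class RunGuard:
    # Hypothetical limits chosen for illustration, not taken from the episode.
    max_steps: int = 50
    max_cost_usd: float = 5.0
    max_repeats: int = 3  # identical consecutive actions tolerated before stopping

    steps: int = 0
    cost_usd: float = 0.0
    last_action: Optional[str] = None
    repeat_count: int = 0

    def record(self, action: str, cost_usd: float) -> None:
        """Call once per agent step; raises as soon as any limit is hit."""
        self.steps += 1
        self.cost_usd += cost_usd

        # Track how many times in a row the same action has been issued.
        if action == self.last_action:
            self.repeat_count += 1
        else:
            self.last_action = action
            self.repeat_count = 1

        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit exceeded: {self.steps}")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit exceeded: ${self.cost_usd:.2f}")
        if self.repeat_count >= self.max_repeats:
            raise BudgetExceeded(f"action repeated {self.repeat_count} times: {action}")
```

In an agent loop, `guard.record(...)` would be called after every model or tool call, and the raised exception would be routed to an alert, so a stuck run fails loudly instead of billing quietly for weeks.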

INSIGHT

Match Models To Trajectory Types

Opus and Gemini diverged: Opus excels at long tool-call trajectories while Gemini Flash is optimized for fast document processing.
Builders match models to task patterns rather than assuming one model fits all.

ADVICE

Apply Reduce–Offload–Isolate

Reduce, offload, and isolate context to prevent spiraling token usage in agents.
Compact or externalize tool results and delegate heavy tasks to specialist sub-agents.
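
The three moves can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the approach described in the episode: `ContextManager`, `MAX_INLINE_CHARS`, and the in-memory `store` are hypothetical, and a sub-agent is modeled as a plain function that takes its own context list and returns a short summary.

```python
from typing import Callable, Dict, List

MAX_INLINE_CHARS = 200  # hypothetical threshold for what stays in context


class ContextManager:
    def __init__(self) -> None:
        self.messages: List[str] = []    # what the main agent actually sees
        self.store: Dict[str, str] = {}  # stand-in for a file or database

    def add_tool_result(self, name: str, result: str) -> None:
        """Keep small results inline; reduce and offload large ones."""
        if len(result) <= MAX_INLINE_CHARS:
            self.messages.append(f"{name}: {result}")
            return
        key = f"{name}-{len(self.store)}"
        self.store[key] = result             # offload: full text lives outside the context
        preview = result[:MAX_INLINE_CHARS]  # reduce: only a short preview stays inline
        self.messages.append(f"{name}: {preview}... [full result stored as {key}]")

    def delegate(self, task: str, sub_agent: Callable[[List[str]], str]) -> None:
        """Isolate: the sub-agent works in a fresh context of its own."""
        sub_context = [task]
        summary = sub_agent(sub_context)
        # Only the summary returns to the main context, not the sub-agent's history.
        self.messages.append(f"sub-agent result: {summary}")
```

A real system would likely replace the character-count preview with a summarization step and the dictionary with durable storage; the point of the sketch is only that the main context grows by a bounded amount per tool call or delegated task.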
