Software Engineering Daily

Optimizing Agent Behavior in Production with Gideon Mendels

Feb 17, 2026
Gideon Mendels, co-founder and CEO of Comet, who built ML systems at Google, talks about building and evaluating LLM-powered agents. He covers why non-determinism breaks traditional testing, how evals serve as test suites, how to bootstrap regression tests from production failures, and why prompts, tools, and configs can be treated as optimization and search problems.
ANECDOTE

From Software Engineer To MLOps Founder

  • Gideon described moving from software engineering into ML and seeing chaotic ML workflows at Google that inspired Comet.
  • He and his co-founder built experiment tracking in 2017–18, which later evolved to support LLM-driven agent workflows with Opik in 2024.
INSIGHT

Agents Are A Hybrid Engineering Domain

  • Agent development sits between traditional software engineering and ML because builders control prompts, tools, and context rather than weights.
  • This hybrid nature requires different SDKs, UIs, and operational patterns than classic MLOps.
ADVICE

Bootstrap Evals From Real Failures

  • Build evaluation suites (evals) that map inputs to expected outputs and score the distance between the actual and expected outputs.
  • Use human subject-matter experts and product workflows to bootstrap regression tests from real production failures.
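To make the idea concrete, here is a minimal sketch of an eval suite in Python. It is not taken from the episode or from any Comet API: `EvalCase`, `case_from_failure`, `run_evals`, and `toy_agent` are hypothetical names, the string-similarity score from the standard library's `difflib` is only a stand-in for whatever distance metric fits a real product, and the 0.8 pass threshold is an arbitrary assumption.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    """One eval: an input paired with the output an expert expects."""
    input_text: str
    expected: str


def similarity(actual: str, expected: str) -> float:
    """Score in [0, 1]; 1.0 means the normalized strings are identical."""
    return SequenceMatcher(
        None, actual.strip().lower(), expected.strip().lower()
    ).ratio()


def case_from_failure(input_text: str, corrected_output: str) -> EvalCase:
    """Turn a production failure plus an expert-corrected answer
    into a regression case (hypothetical helper)."""
    return EvalCase(input_text=input_text, expected=corrected_output)


def run_evals(
    agent: Callable[[str], str],
    cases: List[EvalCase],
    threshold: float = 0.8,  # assumed pass mark, tune per product
) -> List[dict]:
    """Run the agent on every case and score its output against the expectation."""
    results = []
    for case in cases:
        actual = agent(case.input_text)
        score = similarity(actual, case.expected)
        results.append(
            {"input": case.input_text, "score": score, "passed": score >= threshold}
        )
    return results


def toy_agent(prompt: str) -> str:
    """Stand-in for a real LLM-backed agent (hypothetical)."""
    answers = {"What is the refund window?": "30 days"}
    return answers.get(prompt, "I don't know")


suite = [
    EvalCase("What is the refund window?", "30 days"),
    # A failure observed in production, with the answer an expert supplied.
    case_from_failure("Do you ship to Canada?", "Yes, we ship to Canada"),
]
results = run_evals(toy_agent, suite)
```

In this sketch the second case fails until the agent is changed to answer the shipping question, which is the point of a regression test bootstrapped from a real failure. Because agent output is non-deterministic, a real suite would likely run each case several times and use a semantic or model-based scorer rather than character-level similarity.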