Software Engineering Daily

Optimizing Agent Behavior in Production with Gideon Mendels

Feb 17, 2026

Gideon Mendels, co-founder and CEO of Comet, who built ML systems at Google, talks about building and evaluating LLM-powered agents. He covers why non-determinism breaks traditional testing, how evals serve as test suites, how to bootstrap regression tests from production, and why prompts, tools, and configs can be treated as optimization/search problems.

ANECDOTE

From Software Engineer To MLOps Founder

Gideon described moving from software engineering into ML and seeing chaotic ML workflows at Google that inspired Comet.
He and his co-founder built experiment tracking in 2017–18, which later evolved to support LLM-driven agent workflows with Opik in 2024.

INSIGHT

Agents Are A Hybrid Engineering Domain

Agent development sits between traditional software engineering and ML because builders control prompts, tools, and context rather than weights.
This hybrid nature requires different SDKs, UIs, and operational patterns than classic MLOps.

ADVICE

Bootstrap Evals From Real Failures

Build evaluation suites (evals) that map inputs to expected outputs and score the distance between the agent's actual output and the expected one.
Use human subject-matter experts and product workflows to turn real production failures into regression tests.
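
The advice above can be sketched as a small eval harness. This is a minimal illustration, not Comet's or Opik's API: EvalCase, similarity, run_evals, case_from_failure, and toy_agent are hypothetical names, the 0.8 threshold is arbitrary, and the string-similarity score from Python's difflib stands in for whatever distance metric (exact match, embedding distance, LLM-as-judge) a real suite would use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable


@dataclass
class EvalCase:
    """One regression case: an input and the output an expert expects."""
    prompt: str
    expected: str


def similarity(actual: str, expected: str) -> float:
    """Return a score in [0, 1]; 1.0 means the normalized strings are identical."""
    return SequenceMatcher(None, actual.strip().lower(), expected.strip().lower()).ratio()


def run_evals(agent: Callable[[str], str], cases: list[EvalCase],
              threshold: float = 0.8) -> list[dict]:
    """Run every case through the agent and record its score and pass/fail."""
    results = []
    for case in cases:
        actual = agent(case.prompt)
        score = similarity(actual, case.expected)
        results.append({"prompt": case.prompt, "actual": actual,
                        "score": score, "passed": score >= threshold})
    return results


def case_from_failure(prompt: str, corrected_output: str) -> EvalCase:
    """Turn a logged production failure plus an expert's correction into a case."""
    return EvalCase(prompt=prompt, expected=corrected_output)


def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM-backed agent."""
    canned = {"What is the refund window?": "Refunds are accepted within 30 days."}
    return canned.get(prompt, "I don't know.")


# Two cases bootstrapped from (made-up) production failures.
suite = [
    case_from_failure("What is the refund window?", "Refunds are accepted within 30 days."),
    case_from_failure("Do you ship to Canada?", "Yes, we ship to Canada."),
]
results = run_evals(toy_agent, suite)
pass_rate = sum(r["passed"] for r in results) / len(results)
```

Because agent output is non-deterministic, a real suite would likely run each case several times and track the pass rate across prompt or config changes rather than rely on a single run.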