Optimizing Agent Behavior in Production with Gideon Mendels

Software Engineering Daily

Evals as test suites for agents

Gideon defines evals, explains why they are powerful, and describes why building reliable evaluation datasets is hard but crucial for production agents.

Play episode from 10:56