Evals, Feedback Loops, and the Engineering That Makes AI Work

Feb 17, 2026

Ankur Goyal, founder and CEO of Braintrust and a former database and AI products engineer, joins to talk about engineering, evals, and productionizing models. He explains what evals do and why feedback loops matter. The conversation weighs systems versus scaling mindsets, compares SQL and Bash agent designs, and unpacks open versus closed model cycles and Chinese model dynamics.

INSIGHT

Evals Are Scientific Engineering

Evals are the scientific method for non-deterministic AI systems and make hypotheses testable.
Ankur Goyal says that combining qualitative review with quantitative metrics refines evals over time.

ADVICE

Design Agents To Be Disposable

Build agents and context layers expecting to throw them away tomorrow, so the system stays flexible.
Engineer strong feedback loops from production back to tests so you know where models actually fail.

INSIGHT

When Models Stall, Engineering Wins

Model quality improvements slow unpredictably, creating opportunities to engineer efficiency.
When you can't make the model 1% smarter, engineering can make it far more efficient.
