AI + a16z

Evals, Feedback Loops, and the Engineering That Makes AI Work

Feb 17, 2026
Ankur Goyal, founder and CEO of Braintrust and a former database and AI products engineer, joins to discuss engineering, evals, and productionizing models. He explains what evals do and why feedback loops matter. The conversation contrasts systems and scaling mindsets, compares SQL and Bash agent designs, and unpacks open vs. closed model cycles and the dynamics of Chinese models.
INSIGHT

Evals Are Scientific Engineering

  • Evals are the scientific method for non-deterministic AI systems and make hypotheses testable.
  • Ankur Goyal says that combining qualitative review with quantitative metrics refines evals over time.
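
To make the "testable hypothesis" idea concrete, here is a minimal sketch of an eval loop in plain Python. It is not Braintrust's API or anything shown in the episode: `EvalCase`, `run_eval`, `exact_match`, and the two toy tasks are hypothetical names, and the tasks are string functions standing in for model calls. The hypothesis under test is "the candidate beats the baseline on this dataset."

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One dataset row: an input and the output we expect for it."""
    input: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Quantitative scorer: 1.0 on an exact match, otherwise 0.0."""
    return 1.0 if output == expected else 0.0


def run_eval(
    task: Callable[[str], str],
    cases: List[EvalCase],
    scorer: Callable[[str, str], float] = exact_match,
) -> float:
    """Run the task on every case and return the mean score."""
    scores = [scorer(task(case.input), case.expected) for case in cases]
    return sum(scores) / len(scores)


# Hypothetical stand-ins for two versions of a prompt or model.
def baseline_task(text: str) -> str:
    return text.upper()


def candidate_task(text: str) -> str:
    return text.strip().upper()


cases = [
    EvalCase("hello", "HELLO"),
    EvalCase("  world ", "WORLD"),
    EvalCase("ok", "OK"),
]

baseline_score = run_eval(baseline_task, cases)
candidate_score = run_eval(candidate_task, cases)
print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f}")
```

The qualitative half of the loop is reading the individual failures (here, the whitespace case the baseline misses) and deciding whether the dataset or the scorer needs to change.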
ADVICE

Design Agents To Be Disposable

  • Build agents and context layers expecting to throw them away tomorrow, so you stay flexible as models change.
  • Engineer strong feedback loops from production to tests to know where models actually fail.
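
One way to picture a production-to-test feedback loop is the sketch below. It assumes production logs carry a user rating, which the episode summary does not specify; `ProductionLog`, `harvest_failures`, and the dictionary fields are hypothetical names chosen for illustration. Negatively rated interactions become regression cases awaiting a human-written expected answer.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProductionLog:
    """A logged interaction plus a user rating (assumed to be available)."""
    input: str
    output: str
    thumbs_up: bool


def harvest_failures(
    logs: List[ProductionLog],
    regression_set: List[Dict[str, Optional[str]]],
) -> List[Dict[str, Optional[str]]]:
    """Append each negatively rated input to the regression set, once per input."""
    known = {case["input"] for case in regression_set}
    for log in logs:
        if not log.thumbs_up and log.input not in known:
            regression_set.append(
                {
                    "input": log.input,
                    "bad_output": log.output,
                    # Left empty until a reviewer writes the correct answer.
                    "expected": None,
                }
            )
            known.add(log.input)
    return regression_set


logs = [
    ProductionLog("refund policy?", "We do not offer refunds.", thumbs_up=False),
    ProductionLog("store hours?", "9am to 5pm.", thumbs_up=True),
    ProductionLog("refund policy?", "No refunds.", thumbs_up=False),
    ProductionLog("cancel order", "I cannot help with that.", thumbs_up=False),
]

regression_set = harvest_failures(logs, [])
print([case["input"] for case in regression_set])
```

Because the harness only depends on inputs and expected outputs, the regression set survives even when the agent that produced the failures is thrown away.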
INSIGHT

When Models Stall, Engineering Wins

  • Model quality improvements slow unpredictably, creating opportunities to engineer efficiency.
  • When you can't make the model 1% smarter, engineering can make it far more efficient.