METR’s Joel Becker on exponential Time Horizon Evals, Threat Models, and the Limits of AI Productivity

Latent Space: The AI Engineer Podcast

Harnesses, Scaffolding, and Overfitting Evaluations

Joel explains how METR designs its harnesses, how much scaffolding contributes to performance, and the risk of overfitting across dev and test sets.

Play episode from 48:30
