METR’s Joel Becker on Exponential Time Horizon Evals, Threat Models, and the Limits of AI Productivity

Latent Space: The AI Engineer Podcast

Benchmarks for Self-Improvement and AI R&D Automation

Joel surveys benchmarks such as PaperBench and RE-Bench and how they relate to automating R&D tasks and failures.

