Joel Becker on METR's famous time horizons chart

AI Summer

Holistic model evaluation beyond benchmarks

Joel values qualitative deployment anecdotes and hands-on testing to find where models remain 'derpy'.

Play episode from 33:52
