
[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka
Latent Space: The AI Engineer Podcast
Future of Benchmarking and Cost
Nathan warns frontier evals will grow costly; coding is easier to evaluate, while UI and system tests remain harder.
Transcript


