
[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka
Latent Space: The AI Engineer Podcast
SWE-bench Verified and OpenAI's Curation
They explain OpenAI's effort to curate 500 high-quality SWE-bench tasks with human vetting and auditing.
Transcript


