
[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka
Latent Space: The AI Engineer Podcast
Introducing SWE-Bench and Its Purpose
swyx defines SWE-Bench: a coding benchmark built from issues in open-source repositories to evaluate LLM coding capability.
Transcript


