Latent Space: The AI Engineer Podcast

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

Feb 26, 2026
Swyx, AI writer and commentator, and Sebastian Raschka, ML professor specializing in interpretability, join to dissect distillation, benchmarks, and model memorization. They debate the limits of detecting API-based distillation, teacher-student dynamics, and logits-based versus open-weight approaches. SWE-Bench and its vulnerabilities to leakage and curation issues are also explored.
INSIGHT

What Distillation Really Means For LLMs

  • Distillation means training a smaller model on a larger model's outputs to transfer capability into a cheaper, more efficient model.
  • The speakers noted that distillation spans a range of techniques, from logits-based teacher-student training to simply fine-tuning on synthetic QA pairs collected from API outputs.
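The logits-based end of that range can be sketched as a standard soft-target distillation loss: the student is trained to match the teacher's temperature-softened output distribution via KL divergence (the T² scaling follows Hinton et al.'s classic formulation). This is a minimal NumPy illustration, not code from the episode:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (temperature ** 2) * kl.mean()
```

When the student's logits match the teacher's exactly the loss is zero; any mismatch yields a positive penalty. The API-outputs end of the range skips logits entirely and just does supervised fine-tuning on (prompt, sampled answer) pairs.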
INSIGHT

APIs Ban Distillation But Enforcement Is New

  • API terms of service commonly prohibit using model outputs to train competitive models, but enforcement has been rare until recently.
  • Anthropic's post framed large-scale API data collection as an 'attack' after detecting distributed accounts hitting their API heavily.
INSIGHT

How Companies Might Detect Distillation

  • Distinguishing evaluation from distillation is hard because both involve large numbers of API calls producing answers.
  • Speakers said detection mainly relies on scale, repetitive patterns, and distribution of question types across many accounts.
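A detection heuristic along those lines might aggregate per-account query-topic distributions and flag high-volume accounts whose distributions are near-duplicates of each other, suggesting one actor splitting a scraping job across accounts. This is a hypothetical sketch of that idea (all names and thresholds are illustrative, not Anthropic's actual method):

```python
import numpy as np

def flag_coordinated_accounts(category_counts, volume_threshold=10_000,
                              similarity_threshold=0.99):
    """Hypothetical heuristic: flag pairs of high-volume accounts whose
    question-type distributions are nearly identical (cosine similarity),
    a possible sign of one actor distributing distillation traffic.

    category_counts: dict mapping account name -> list of per-topic call counts.
    """
    names = list(category_counts)
    # Normalize each account's counts into a topic distribution.
    dists = {n: np.asarray(c, dtype=float) / max(sum(c), 1)
             for n, c in category_counts.items()}
    flagged = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Only consider accounts with suspiciously large call volume.
            if (sum(category_counts[a]) < volume_threshold or
                    sum(category_counts[b]) < volume_threshold):
                continue
            da, db = dists[a], dists[b]
            cos = da @ db / (np.linalg.norm(da) * np.linalg.norm(db) + 1e-12)
            if cos >= similarity_threshold:
                flagged.update((a, b))
    return sorted(flagged)
```

The point the speakers make survives the sketch: an honest evaluation run can look exactly like this too, which is why scale and cross-account correlation, not any single request, carry the signal.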