Latent Space: The AI Engineer Podcast

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

Feb 26, 2026
Swyx, AI writer and commentator, and Sebastian Raschka, ML professor specializing in interpretability, join to dissect distillation, benchmarks, and model memorization. They debate the limits of detecting API-based distillation, teacher-student dynamics, and logits-based versus open-weight approaches. SWE-Bench and its vulnerabilities to leakage and curation issues are also explored.
INSIGHT

What Distillation Really Means For LLMs

  • Distillation means training a smaller model on a larger model's outputs to transfer capability into a cheaper, more efficient model.
  • The speakers noted that distillation spans a range of techniques, from logits-based teacher-student training to simply fine-tuning on synthetic QA pairs collected from API outputs.
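The logits-based end of that range can be sketched as a standard soft-target distillation loss: the student is trained to match the teacher's temperature-softened output distribution via KL divergence (the T² scaling follows Hinton et al.'s classic formulation). This is a minimal NumPy illustration, not code from the episode:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (temperature ** 2) * kl.mean()
```

When the student's logits match the teacher's exactly the loss is zero; any mismatch yields a positive penalty. The API-outputs end of the range skips logits entirely and just does supervised fine-tuning on (prompt, sampled answer) pairs.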
INSIGHT

APIs Ban Distillation But Enforcement Is New

  • API terms of service commonly prohibit using model outputs to train competitive models, but enforcement has been rare until recently.
  • Anthropic's post framed large-scale API data collection as an 'attack' after detecting distributed accounts hitting their API heavily.
INSIGHT

How Companies Might Detect Distillation

  • Distinguishing evaluation from distillation is hard because both involve large numbers of API calls producing answers.
  • Speakers said detection mainly relies on scale, repetitive patterns, and distribution of question types across many accounts.
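A detection heuristic along those lines might aggregate per-account query-topic distributions and flag high-volume accounts whose distributions are near-duplicates of each other, suggesting one actor splitting a scraping job across accounts. This is a hypothetical sketch of that idea (all names and thresholds are illustrative, not Anthropic's actual method):

```python
import numpy as np

def flag_coordinated_accounts(category_counts, volume_threshold=10_000,
                              similarity_threshold=0.99):
    """Hypothetical heuristic: flag pairs of high-volume accounts whose
    question-type distributions are nearly identical (cosine similarity),
    a possible sign of one actor distributing distillation traffic.

    category_counts: dict mapping account name -> list of per-topic call counts.
    """
    names = list(category_counts)
    # Normalize each account's counts into a topic distribution.
    dists = {n: np.asarray(c, dtype=float) / max(sum(c), 1)
             for n, c in category_counts.items()}
    flagged = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Only consider accounts with suspiciously large call volume.
            if (sum(category_counts[a]) < volume_threshold or
                    sum(category_counts[b]) < volume_threshold):
                continue
            da, db = dists[a], dists[b]
            cos = da @ db / (np.linalg.norm(da) * np.linalg.norm(db) + 1e-12)
            if cos >= similarity_threshold:
                flagged.update((a, b))
    return sorted(flagged)
```

The point the speakers make survives the sketch: an honest evaluation run can look exactly like this too, which is why scale and cross-account correlation, not any single request, carry the signal.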