
The Attention Mechanism with Andrew Mayne A 10-Speed Bicycle Behind a Tractor Trailer
Feb 24, 2026 They dig into claims that Chinese labs may be training models by copying other models' outputs, and why that matters. They debate Anthropic's fraught deal-making with the Pentagon and the ethics of safety limits, and unpack local fights over data centers and how towns might negotiate benefits. One host recounts building an iOS app with Codex and explores practical AI automations.
AI Snips
Model Distillation Explains Rapid Chinese Progress
- Distillation trains a smaller student model on the outputs of a larger teacher model, producing fast, cheap students that inherit much of the teacher's capability.
- Andrew Mayne likened Chinese labs' gains to drafting a 10-speed bicycle behind a tractor trailer: progress stalls once the lead model stops supplying outputs.
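The snips refer to distillation of large language models via API outputs; as a toy illustration only, here is a minimal sketch in plain Python of the underlying idea, with a hypothetical linear "teacher" classifier and a student trained purely on the teacher's soft outputs (no ground-truth labels):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical "teacher": a fixed linear scorer, 2 features -> 3 classes.
TEACHER_W = [[2.0, -1.0], [-1.5, 2.5], [0.5, 0.5]]

def teacher_probs(x):
    return softmax([w[0] * x[0] + w[1] * x[1] for w in TEACHER_W])

def train_student(samples, epochs=200, lr=0.5):
    """Fit a student of the same shape using ONLY the teacher's soft labels."""
    W = [[0.0, 0.0] for _ in range(3)]  # student starts from scratch
    for _ in range(epochs):
        for x in samples:
            target = teacher_probs(x)  # soft labels copied from the teacher
            pred = softmax([w[0] * x[0] + w[1] * x[1] for w in W])
            # Gradient of cross-entropy(teacher, student) for a linear model
            for k in range(3):
                g = pred[k] - target[k]
                W[k][0] -= lr * g * x[0]
                W[k][1] -= lr * g * x[1]
    return W

random.seed(0)
data = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
student_W = train_student(data)

# On a held-out point, the student now mimics the teacher's decision.
probe = (0.3, -0.7)
student_out = softmax([w[0] * probe[0] + w[1] * probe[1] for w in student_W])
teacher_out = teacher_probs(probe)
agree = max(range(3), key=lambda k: student_out[k]) == \
        max(range(3), key=lambda k: teacher_out[k])
```

The point of the sketch is the asymmetry the snip describes: the student never sees training labels, only the teacher's outputs, which is cheap while the teacher keeps answering and stops working the moment it doesn't.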
Anthropic Called Out Specific Labs For Data Exfiltration
- Anthropic accused specific Chinese labs (DeepSeek, Minimax) of copying Anthropic/OpenAI outputs and violating API terms to train models.
- Mayne framed this as a technical problem, not merely an ethical one, because copied outputs are a shortcut to high-quality training data.
Vet Red Teamers To Protect Model Secrets
- Be cautious when using external red-teamers or third-party researchers early in model development.
- Andrew Mayne warns these partners can be a major vector for exfiltrating model secrets and can collapse trust if mishandled.
