
Don't Worry About the Vase Podcast
Anthropic Responsible Scaling Policy v3: Dive Into The Details
Apr 3, 2026
A deep dive into Anthropic's Responsible Scaling Policy v3.0, focusing on how the new rules work and what practical levers remain. Discussions cover risk-report cadences, sabotage and insider-threat models, and capability thresholds for chemical and biological dangers. The conversation also examines removed safety checks, roadmap timelines, and who gets to decide what counts as a "strong argument."
AI Snips
Risk Reports Are Regular But Not Blocking
- Risk reports will be published every 3–6 months; they supplement model release assessments but won't automatically block releases.
- Zvi notes that this cadence may be too slow given model release cycles, and that diffs are required within 30 days of new internal models.
New Risk Categories Add Sabotage And CBRN Distinctions
- Anthropic adds new categories: CBRN-3/4 for chemical and biological risks, and 'high-stakes sabotage', which focuses on internal threats to future model alignment.
- Zvi praises the focus but warns that the requirements are vague and lack concrete enforcement.
Automated R&D Threshold Is High And Missing Intermediates
- 'Automated R&D' is operationalized as models that can compress two years of 2018–2024 AI progress into one year, triggering stringent mitigations.
- Zvi thinks the threshold is high and lacks intermediate triggers for earlier risks.
