Anthropic Responsible Scaling Policy v3: Dive Into The Details

Apr 3, 2026

A deep dive into Anthropic's Responsible Scaling Policy v3.0, focusing on how the new rules work and what practical levers remain. Discussions cover risk-report cadences, sabotage and insider-threat models, and capability thresholds for chemical and biological dangers. The conversation also examines removed safety checks, roadmap timelines, and who gets to decide what counts as a "strong argument."

INSIGHT

Risk Reports Are Regular But Not Blocking

Risk reports will be published every 3–6 months and are additive to model release assessments but won't automatically block releases.
Zvi notes that this cadence may be too slow given model release cycles, and that diffs to the report are required within 30 days of a new internal model.

INSIGHT

New Risk Categories Add Sabotage And CBRN Distinctions

Anthropic adds new categories: CBRN-3/4 for chemical and biological risks, and 'high-stakes sabotage', which focuses on internal threats to future model alignment.
Zvi praises the focus but warns the requirements are vague without concrete enforcement.

INSIGHT

Automated R&D Threshold Is High And Missing Intermediates

'Automated R&D' is operationalized as models that can compress two years of 2018–2024 AI progress into one year, triggering stringent mitigations.
Zvi thinks the threshold is high and missing intermediate triggers for earlier risks.
