Scaling Laws: Claude's Constitution, with Amanda Askell

20 snips

Feb 20, 2026

Amanda Askell, head of personality alignment at Anthropic and primary author of Claude's Constitution, explains the 20,000-word framework that shapes Claude's values and behavior. She describes how the constitution guides training and reward signals. The conversation covers fidelity to text versus spirit, virtue ethics over rigid rules, cultural universality, decision hierarchies, and implications for moral patienthood and specialized domains.

Ask episode

AI Snips

Chapters

Transcript

Episode notes

INSIGHT

Training Over Literal Fidelity

Fidelity to the constitution is enforced by steering models during training rather than by hard textual enforcement at runtime.
Anthropic expects to publish constitution changes when retraining shifts model values to preserve transparency.

INSIGHT

Building Case Law For Edge Cases

Anthropic imagines a 'case law' of difficult examples to clarify how constitutional values apply in edge cases.
These illustrative cases would guide consistent interpretation and future training decisions.

ADVICE

Centralize Value Trade-Offs

Maintain coherence by centralizing trade-off decisions rather than letting many teams impose local rules.
Use a small group to ensure consistent values across domains and avoid a fractured model.

Get the Snipd Podcast app to discover more snips from this episode

Get the app