The Lawfare Podcast

Scaling Laws: Claude's Constitution, with Amanda Askell

Feb 20, 2026
Amanda Askell, head of personality alignment at Anthropic and primary author of Claude's Constitution, explains the 20,000-word framework that shapes Claude's values and behavior. She describes how the constitution guides training and reward signals. The conversation covers fidelity to text versus spirit, virtue ethics over rigid rules, cultural universality, decision hierarchies, and implications for moral patienthood and specialized domains.
INSIGHT

Training Over Literal Fidelity

  • Fidelity to the constitution is enforced by steering models during training rather than by hard textual enforcement at runtime.
  • To preserve transparency, Anthropic expects to publish changes to the constitution whenever retraining shifts the model's values.
INSIGHT

Building Case Law For Edge Cases

  • Anthropic imagines a 'case law' of difficult examples to clarify how constitutional values apply in edge cases.
  • These illustrative cases would guide consistent interpretation and future training decisions.
ADVICE

Centralize Value Trade-Offs

  • Maintain coherence by centralizing trade-off decisions rather than letting many teams impose local rules.
  • Use a small group to ensure consistent values across domains and avoid a fractured model.