
Scaling Laws: Claude's Constitution, with Amanda Askell
Feb 20, 2026. Amanda Askell, the researcher leading Anthropic’s personality alignment team and primary author of Claude’s Constitution, explains the Constitution as a training guide for values and behavior. Methods covered include supervised learning and RL reward signals. Discussion touches on enforcement, living-document updates, corrigibility vs. human judgment, cultural adaptation, instruction hierarchies, and the ethics of personhood.
AI Snips
Constitution As Training And Transparency
- Claude's Constitution serves both as a transparency document and as training context to shape model behavior.
- Anthropic trains Claude to reason with the Constitution during supervised and reinforcement learning, producing reward signals aligned with the Constitution's values.
Feed Models Rich Context During Training
- Use the Constitution in both supervised learning and RL to generate training data and reward signals informed by values.
- Let capable models ingest more context to improve alignment instead of hiding high-level guidance.
Steering Toward Spirit Over Literal Rules
- Strict rule violations are rare; steering during training toward the Constitution's spirit matters more than policing literal breaches.
- Anthropic may publish case-law-like examples to clarify difficult trade-offs and ensure consistent interpretation.

