Claude's Constitution, with Amanda Askell

Feb 20, 2026

Amanda Askell is a researcher who leads Anthropic’s personality alignment team and is the primary author of Claude’s Constitution. She explains the Constitution as a training guide for values and behavior. Methods covered include supervised learning and RL signals. The discussion touches on enforcement, living-document updates, corrigibility vs. human judgment, cultural adaptation, instruction hierarchies, and the ethics of personhood.

INSIGHT

Constitution As Training And Transparency

Claude's Constitution serves both as a transparency document and as training context to shape model behavior.
Anthropic trains Claude to reason with the Constitution during supervised and reinforcement learning to create reward signals aligned with its values.

ADVICE

Feed Models Rich Context During Training

Use the Constitution in both supervised learning and RL to generate training data and reward signals informed by values.
Let capable models ingest more context to improve alignment instead of hiding high-level guidance.

INSIGHT

Steering Toward Spirit Over Literal Rules

Strict rule violations are rare; steering during training toward the Constitution's spirit matters more than policing literal breaches.
Anthropic may publish case-law-like examples to clarify difficult trade-offs and ensure consistent interpretation.
