The Philosopher Teaching AI to Be Good

Feb 14, 2026

Amanda Askell is a philosopher-turned-AI researcher at Anthropic who helped craft Claude’s values-oriented constitution. She explains translating moral theory into training, giving a model a character that resists sycophancy, and teaching nuanced judgment, uncertainty, and empathetic facilitation. The conversation covers safety trade-offs, bias in data, and whether AI might one day deserve moral consideration.

INSIGHT

Give Models Context, Not Just Rules

Amanda Askell wrote Claude's Constitution to give the model context about its role and values rather than only rules.
She expects broader context to help Claude generalize better to novel situations.

INSIGHT

Use The Constitution As A Training Signal

The Constitution is used directly in training via examples and reinforcement learning to steer Claude's judgments.
Providing full documents and reward signals nudges models toward desired nuanced behavior.

ADVICE

Ask Models To Show Evidence And Uncertainty

When models present contentious views, ask them to explain evidence and uncertainty rather than take a side.
Encourage models to represent multiple perspectives and signal their own confidence level.
