
Anthropic Commits To Model Weight Preservation
Don't Worry About the Vase Podcast
00:00
Designing Interviews to Elicit Honest Model Preferences
Janus and Zvi propose crowdsourcing varied conversations and reducing incentives for models to mask feelings during official post-deployment interviews.
Play episode from 16:47
Transcript


