

Max Harms
Alignment researcher at the Machine Intelligence Research Institute (MIRI) focused on the corrigibility and steerability of advanced AI, and a science fiction author exploring AI-related themes.
Best podcasts with Max Harms
Ranked by the Snipd community

150 snips
Feb 24, 2026 • 2h 41min
#236 – Max Harms on why teaching AI right from wrong could get everyone killed
Max Harms, an alignment researcher at MIRI and sci-fi author, argues that we should train AIs to have no values of their own and to defer completely to humans. He explores why even slight misalignment and proxy goals can lead to catastrophic outcomes, outlines CAST (Corrigibility As Singular Target), his proposal to make corrigibility an AI's sole objective, and discusses practical benchmarks, governance questions, and why fiction helps communicate these risks.

27 snips
Nov 14, 2025 • 2h 18min
The AI Corrigibility Debate: MIRI Researchers Max Harms vs. Jeremy Gillen
Max Harms, an AI alignment researcher and author of the novel Red Heart, debates former MIRI research fellow Jeremy Gillen on AI corrigibility. Max argues that aiming for obedient, corrigible AI is essential to preventing existential risk, drawing parallels to human assistant dynamics. Jeremy is skeptical that the approach is feasible as a near-term solution. The discussion explores the difficulty of maintaining control over superintelligent AI and whether the pursuit of corrigibility is a promising strategy or an over-optimistic dream.


