

Max Harms
Alignment researcher at the Machine Intelligence Research Institute (MIRI) focused on the corrigibility and steerability of advanced AI, and a science fiction author exploring AI-related themes.
Best podcasts with Max Harms
Ranked by the Snipd community

150 snips
Feb 24, 2026 • 2h 41min
#236 – Max Harms on why teaching AI right from wrong could get everyone killed
Max Harms, an alignment researcher at MIRI and sci-fi author, argues that we should train AIs to have no values of their own and to defer completely to humans. He explores why even slight misalignment and proxy goals can lead to catastrophic outcomes, outlines CAST (Corrigibility As Singular Target), his proposal to make corrigibility an AI's sole objective, and discusses practical benchmarks, governance questions, and why fiction helps communicate these risks.

27 snips
Nov 14, 2025 • 2h 18min
The AI Corrigibility Debate: MIRI Researchers Max Harms vs. Jeremy Gillen
Max Harms, an AI alignment researcher and author of the novel Red Heart, debates former MIRI research fellow Jeremy Gillen on AI corrigibility. Max argues that aiming for obedient, corrigible AI is essential to preventing existential risk, drawing parallels to human assistant dynamics. Jeremy is skeptical that the approach is feasible as a near-term solution. The discussion explores the difficulty of maintaining control over superintelligent AI and whether the pursuit of corrigibility is a promising strategy or an over-optimistic dream.


