

Nora Belrose
Head of interpretability at EleutherAI, focusing on understanding and improving AI's inner workings and alignment.
Best podcasts with Nora Belrose
Ranked by the Snipd community

Nov 17, 2024 • 2h 30min
Nora Belrose - AI Development, Safety, and Meaning
Nora Belrose, Head of Interpretability Research at EleutherAI, dives into the complexities of AI development and safety. She explores concept erasure in neural networks and its role in bias mitigation. Challenging doomsday fears about advanced AI, she critiques current alignment methods and highlights the limitations of traditional approaches. The discussion broadens to consider the philosophical implications of AI's evolution, including a fascinating link between Buddhism and the search for meaning in a future shaped by automation.

Nov 23, 2025 • 1h 6min
Conversation 1 with Nora Belrose: AI, sentience, and Platonic Space
Nora Belrose, head of interpretability research at EleutherAI, explores AI sentience, moral relevance, and Platonic mindspace. The conversation covers sentience as irreversible learning, how intelligence can diverge from consciousness, the ethics of copies and simulations, and whether abstract patterns are static or dynamic. It also highlights the need for new tools to detect subtle agency in advanced systems.


