Leo Gao
Researcher and writer focused on mechanistic interpretability and AI alignment, author of the essay 'An Ambitious Vision for Interpretability' narrated in this episode.
Best podcasts with Leo Gao
Ranked by the Snipd community
Dec 6, 2025 • 9 min
“An Ambitious Vision for Interpretability” by leogao
Leo Gao, a researcher in mechanistic interpretability and AI alignment, lays out an ambitious vision for fully understanding neural networks. He discusses why mechanistic understanding is crucial for effective debugging, allowing us to untangle complex behaviors like scheming. Gao shares progress on circuit sparsity, discusses challenges facing the interpretability field, and envisions future advances in which small interpretable models provide insights that scale up to larger ones. Expect thought-provoking ideas on enhancing AI transparency.