Tom McGrath

Chief Scientist at Goodfire, specializing in mechanistic interpretability, loss-landscape analysis, and methods for shaping model behavior during and after training.

Top 3 podcasts with Tom McGrath

Ranked by the Snipd community
258 snips
May 29, 2025 • 1h 50min

Mechanistic Interpretability: Philosophy, Practice & Progress with Goodfire's Dan Balsam & Tom McGrath

Dan Balsam, CTO of Goodfire, and Tom McGrath, Chief Scientist, discuss mechanistic interpretability in AI. They examine how understanding neural networks can drive breakthroughs in scientific discovery and creative domains, take on challenges in natural language processing and model debugging, and draw parallels with biology. They also stress the importance of funding and new approaches for advancing AI explainability.
117 snips
Aug 17, 2024 • 1h 52min

Popular Mechanistic Interpretability: Goodfire Lights the Way to AI Safety

Dan Balsam, CTO of Goodfire, who has extensive startup engineering experience, and Tom McGrath, Chief Scientist, who focused on AI safety at DeepMind, dive into mechanistic interpretability. They explore the complexities of AI training, discussing advances like sparse autoencoders and the balance between model complexity and interpretability. The conversation also covers how hierarchical structures in AI relate to human cognition and why AI research and safety call for collaborative effort.
95 snips
Mar 5, 2026 • 1h 47min

Don't Fight Backprop: Goodfire's Vision for Intentional Design, w/ Dan Balsam & Tom McGrath

Tom McGrath, Goodfire chief scientist working on mechanistic interpretability and loss-landscape shaping, and Dan Balsam, Goodfire co-founder focused on monitoring and applied research, dive into intentional design. They explore the geometry of latent manifolds, decomposing gradients into semantic parts, probe-based hallucination reduction and frozen-probe tricks, disentangling memorization from reasoning, and Alzheimer’s biomarker findings.
