
Introduction to Mechanistic Interpretability
BlueDot Narrated
Anthropic's scaling breakthrough
Perrin Walker summarizes Anthropic's use of sparse autoencoders to identify millions of features and safety-relevant representations.


