
Introduction to Mechanistic Interpretability
BlueDot Narrated
Sparse autoencoders for disentangling features
Perrin Walker explains sparse autoencoders and how they encourage monosemantic neurons for interpretability.
Transcript


