The Nonlinear Library

LW - SAE feature geometry is outside the superposition hypothesis by Jake Mendel

Jun 24, 2024
Jake Mendel, writing on LessWrong, discusses how the geometry of sparse autoencoder (SAE) features goes beyond the superposition hypothesis, highlighting the importance of the specific locations of feature vectors and the rich structure they form. Understanding this geometry could lead to new theories or supplement existing ones. The episode explores the limitations of superposition-based interpretations and proposes alternative theories for neural network activation spaces.