The Nonlinear Library

LW - Polysemantic Attention Head in a 4-Layer Transformer by Jett

Nov 9, 2023