The Nonlinear Library

AF - Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training by Evan Hubinger

Jan 12, 2024