The Nonlinear Library

AF - Is scheming more likely in models trained to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") by Joe Carlsmith

Nov 30, 2023