From Atari to ChatGPT: How AI Learned to Follow Instructions
Linear Digressions
Misalignment between objectives and helpfulness
Ben introduces the misalignment problem: a model trained to predict tokens is not necessarily helpful to users.
Transcript


