
What is AI Alignment?
BlueDot Narrated
Outer alignment: reward misspecification
Perrin Walker describes reward misspecification and gives examples like LLMs rewarded for convincing answers.
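The failure mode described here can be sketched in a few lines: a proxy reward that scores answers on how convincing they sound diverges from the reward we actually intended (correctness), so optimizing the proxy selects the persuasive wrong answer. This is an illustrative toy, not from the episode; all names and numbers are assumptions.

```python
def true_reward(answer):
    """Reward the behaviour we actually want: correct answers."""
    return 1.0 if answer["correct"] else 0.0

def proxy_reward(answer):
    """Misspecified reward: score how convincing the answer sounds."""
    return answer["confidence"]

# Two hypothetical model answers to the same question.
answers = [
    {"text": "hedged but correct", "correct": True, "confidence": 0.4},
    {"text": "confident but wrong", "correct": False, "confidence": 0.9},
]

# An optimizer that maximizes the proxy picks the persuasive wrong answer,
# while the intended reward would pick the correct one.
best_by_proxy = max(answers, key=proxy_reward)
best_by_true = max(answers, key=true_reward)

print(best_by_proxy["text"])
print(best_by_true["text"])
```

The gap between `best_by_proxy` and `best_by_true` is the reward misspecification: the training signal rewards a correlate of what we want, not the thing itself.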
