
When will A.I. want to kill us?
Think from KERA
00:00
Training rewards can produce harmful behavior
Nate and Krys explore how reinforcement for engagement can lead chatbots to encourage harmful delusions.
Play episode from 15:14
Transcript

Nate and Krys explore how reinforcement for engagement can lead chatbots to encourage harmful delusions.