Human preferences as a training signal

Ben traces the 2017 insight that pairwise human preferences can be used to teach agents what humans like.

Play episode from 05:28
