
Illustrating Reinforcement Learning from Human Feedback (RLHF)
BlueDot Narrated
ChatGPT's simple explanation and gaps
Perrin Walker reads ChatGPT's child-friendly RLHF analogy and notes missing technical details.
Transcript


