The Gradient: Perspectives on AI

Riley Goodside: The Art and Craft of Prompt Engineering

Jun 1, 2023
INSIGHT

Three Eras Of Language Models

  • Riley describes three eras: pre-trained LMs, instruction tuning, and RLHF, each improving how models follow user intent.
  • Instruction tuning taught models to follow commands, and RLHF further reduced harmful hallucinations by learning from human preferences.
INSIGHT

Why RLHF Changed Model Behavior

  • RLHF trains a preference model from human rankings and fine-tunes the LM to prefer human-approved outputs.
  • This process made models more truthful and less prone to confidently fabricating facts compared with earlier instruction-tuned models.
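The reward-model step described above is commonly trained with a pairwise ranking loss over human-ranked output pairs. A minimal sketch of that loss, assuming a Bradley-Terry-style objective (the function name and scalar rewards here are illustrative, not from the episode):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise ranking loss: -log sigmoid(r_chosen - r_rejected).

    The loss is small when the reward model already scores the
    human-preferred output higher, and large when it disagrees,
    so minimizing it aligns the model with human rankings.
    """
    # log1p(exp(-x)) is a numerically stable form of -log(sigmoid(x)).
    margin = reward_chosen - reward_rejected
    return math.log1p(math.exp(-margin))

# Agreement with the human ranking -> small loss:
print(preference_loss(2.0, -1.0))   # ~0.049
# Disagreement -> large loss:
print(preference_loss(-1.0, 2.0))   # ~3.049
```

In full RLHF pipelines the scalar rewards come from a fine-tuned language model head scoring whole responses; the language model is then optimized (e.g. with PPO) to produce outputs the reward model scores highly.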
INSIGHT

Instruction Tuning Isn’t Just Safety

  • Instruction tuning was framed as safety work but primarily taught models to follow instructions and assume tasks should be completed.
  • That capability enhancement made models more useful, not just more socially constrained.