Cursor

John Schulman on dead ends, scaling RL, and building research institutions

Dec 17, 2025
Join John Schulman, OpenAI co-founder and leading reinforcement learning researcher, as he recounts the early days of OpenAI and the path to rapid advances like ChatGPT. He discusses how research management styles have evolved and how effective team structures foster innovation. Schulman also tackles pressing questions: the future of reinforcement learning, the promise of continual learning, and how AI tools are reshaping researchers' daily workflows. Plus, his views on AGI timelines are sure to spark debate.
INSIGHT

Labs Learn More From Lineage Than Legends

  • OpenAI's culture drew more from places people previously worked than deliberate emulation of Bell Labs or PARC.
  • Early days resembled grad school and Google/DeepMind lineages more than historic industrial labs.
INSIGHT

Value Functions Are Underused Now

  • Value functions currently offer little benefit on modern RL-from-human-feedback tasks despite their variance-reduction role.
  • Schulman expects value functions to return to importance in future regimes.
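The variance-reduction role mentioned above refers to using a learned value function V(s) as a baseline in policy-gradient estimation. A minimal sketch (my own toy illustration, not from the episode, with made-up state values): subtracting V(s) from the return leaves the gradient estimate unbiased but shrinks its spread.

```python
import random

random.seed(0)

# Toy setup: two states with very different expected returns.
V = {0: 1.0, 1: 9.0}  # assumed true per-state expected return

def sample_grad():
    s = random.choice([0, 1])           # visit a random state
    R = V[s] + random.gauss(0, 0.5)     # noisy return from that state
    score = random.choice([-1.0, 1.0])  # stand-in for the score function,
                                        # symmetric so E[g] = 0 either way
    g_raw = R * score                   # no baseline
    g_base = (R - V[s]) * score         # value-function baseline
    return g_raw, g_base

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

samples = [sample_grad() for _ in range(20000)]
raw = [a for a, _ in samples]
base = [b for _, b in samples]

print(variance(raw), variance(base))  # baseline variance is far smaller
```

Both estimators have the same mean (zero here), but the baseline version's variance is dominated only by the return noise rather than by the per-state return magnitudes, which is why value functions help when returns vary widely across states.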
INSIGHT

Short Contexts Versus Long-Term Weights

  • Continual learning likely needs a mix: better context methods for short horizons and weight updates (fine-tuning) for long-term memory.
  • Different memory types (episodic, motor, procedural) will favor different techniques.