
Cursor: John Schulman on dead ends, scaling RL, and building research institutions
Dec 17, 2025

Join John Schulman, co-founder of OpenAI and a leading researcher in reinforcement learning, as he delves into the early days of OpenAI and the potential for rapid advancements in AI like ChatGPT. He discusses the evolution of research management styles and the importance of effective team structures in fostering innovation. Schulman also tackles pressing issues like the future of reinforcement learning, the impact of continual learning, and how AI tools enhance researchers' daily workflows. Plus, his insights on AGI timelines are sure to spark debate!
AI Snips
Labs Learn More From Lineage Than Legends
- OpenAI's culture drew more from places people previously worked than deliberate emulation of Bell Labs or PARC.
- Early days resembled grad school and Google/DeepMind lineages more than historic industrial labs.
Value Functions Are Underused Now
- Value functions currently offer little benefit on modern RL-from-human-feedback tasks, despite their classic variance-reduction role in policy-gradient methods.
- Schulman expects value functions to return to importance in future regimes.
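The variance-reduction role mentioned above can be shown with a toy REINFORCE estimator on a two-armed bandit. This is a minimal sketch, not anything from the episode: the arm rewards, the softmax-logit policy, and the baseline value are all invented for illustration. The idea is that subtracting a value-function-style baseline from the reward leaves the gradient estimate's mean unchanged while shrinking its variance.

```python
import math
import random

random.seed(0)

# Hypothetical 2-armed bandit with deterministic rewards (illustrative values).
REWARDS = {0: 1.0, 1: 3.0}
theta = 0.0  # logit preference for arm 1; pi(1) = sigmoid(theta) = 0.5 here

def sample_grads(baseline, n=10_000):
    """Per-sample REINFORCE gradients: d/dtheta log pi(a) * (R - baseline)."""
    p1 = 1.0 / (1.0 + math.exp(-theta))
    grads = []
    for _ in range(n):
        a = 1 if random.random() < p1 else 0
        r = REWARDS[a]
        # Score function for a softmax over two arms:
        # d/dtheta log pi(a) = (1 - p1) if a == 1 else -p1
        score = (1.0 - p1) if a == 1 else -p1
        grads.append(score * (r - baseline))
    return grads

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

m_no_baseline, v_no_baseline = mean_var(sample_grads(baseline=0.0))
# A value-function-style baseline: expected reward under the current policy.
m_baseline, v_baseline = mean_var(sample_grads(baseline=2.0))
print(v_no_baseline > v_baseline)  # baseline cuts variance, mean is unchanged
```

In this contrived setup the baseline happens to zero out the variance entirely; in realistic RL it only reduces it, which is the role Schulman suggests may matter again in future regimes.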
Short Contexts Versus Long-Term Weights
- Continual learning likely needs a mix: better context methods for short horizons and weight updates (fine-tuning) for long-term memory.
- Different memory types (episodic, motor, procedural) will favor different techniques.
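The context-versus-weights split above can be caricatured in a few lines of code. Everything here is an invented toy, not a description of any real system: a dict stands in for a context/retrieval store, and one-dimensional SGD on a scalar model stands in for fine-tuning. The contrast is that a context entry is available instantly but only while it fits in the store, whereas repeated weight updates absorb a fact slowly but persistently.

```python
# (1) Context route: new facts go into a store consulted at query time.
context_store = {}

def answer_with_context(query):
    # Instant recall for anything currently in the store/context window.
    return context_store.get(query, "unknown")

context_store["capital_of_france"] = "Paris"

# (2) Weight route: a toy scalar model y = w * x, updated by SGD steps
# standing in for fine-tuning. Slower to absorb a fact, but persistent.
w = 0.0

def sgd_step(x, y_target, lr=0.1):
    global w
    pred = w * x
    grad = 2 * (pred - y_target) * x  # d/dw of squared error (pred - y_target)^2
    w -= lr * grad

for _ in range(50):
    sgd_step(x=1.0, y_target=3.0)  # repeated exposure bakes the fact into w
```

Episodic recall maps naturally onto the first mechanism, while motor and procedural memory look more like the second, which is the mix the snip suggests continual learning will need.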

