Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead end

Sep 26, 2025
Richard Sutton, a leading researcher in reinforcement learning and 2024 Turing Award winner, argues that large language models (LLMs) are a dead end. He believes LLMs can't learn on the job and emphasizes the need for a new architecture that enables continual learning, as animals do. The discussion covers how LLMs perform imitation rather than genuine experiential learning, and why instilling goals is vital for intelligence. Sutton critiques the predictive nature of LLMs, advocating for a future where AI learns from real-world interactions rather than fixed datasets.
INSIGHT

Goals Are Central To Intelligence

  • Sutton defines the scalable method as learning from experience with explicit goals to judge right and wrong.
  • He says LLMs start in the wrong place by lacking goals and continual learning mechanisms.
INSIGHT

Animals Learn By Prediction, Not Supervision

  • Sutton argues that supervised imitation isn't the basic learning mechanism in animals; instead, trial-and-error prediction drives learning.
  • He views formal schooling as an exception, not the natural template for intelligence.
INSIGHT

Knowledge Is About The Experience Stream

  • Sutton formalizes the experiential paradigm: continuous sensation, action, and reward form the stream agents learn from.
  • Knowledge is statements about that stream and can be tested and updated continually.
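The experience stream Sutton describes is the standard agent–environment loop of reinforcement learning: sensation in, action out, reward back, repeated forever, with learning happening continually inside the loop. A minimal sketch, using a hypothetical two-state toy environment and a simple incremental value update (all names and parameters here are illustrative, not from the episode):

```python
import random

class ToyEnvironment:
    """Hypothetical environment: the agent is rewarded for matching
    its action to the current sensed state (0 or 1)."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def step(self, action):
        reward = 1.0 if action == self.state else 0.0
        self.state = self.rng.randint(0, 1)  # next sensation
        return self.state, reward

class StreamAgent:
    """Learns action values from the reward stream itself, via an
    incremental trial-and-error update -- no fixed training set."""
    def __init__(self, epsilon=0.1, step_size=0.1, seed=0):
        self.q = {}           # (state, action) -> estimated reward
        self.epsilon = epsilon
        self.step_size = step_size
        self.rng = random.Random(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:      # occasional exploration
            return self.rng.randint(0, 1)
        return max((0, 1), key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward):
        # Move the estimate a fraction of the way toward what was observed.
        key = (state, action)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.step_size * (reward - old)

# The experience stream: sense, act, receive reward, learn -- continually.
env = ToyEnvironment()
agent = StreamAgent()
state = env.state
for t in range(2000):
    action = agent.act(state)
    next_state, reward = env.step(action)
    agent.update(state, action, reward)  # knowledge is tested and updated on the stream
    state = next_state
```

The point of the sketch is that the agent's "knowledge" (its value estimates) is nothing but statements about the stream, and every new moment of experience tests and revises them.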