Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead end

Sep 26, 2025
Richard Sutton, a leading researcher in reinforcement learning and 2024 Turing Award winner, argues that large language models (LLMs) are a dead end. He believes LLMs can't learn on the job and emphasizes the need for a new architecture that enables continual learning, as animals do. The discussion covers how LLMs perform imitation rather than genuine experiential learning, and why instilling goals is vital for intelligence. Sutton critiques the predictive nature of LLMs, advocating for a future where AI learns from real-world interactions rather than fixed datasets.
INSIGHT

Goals Are Central To Intelligence

  • Sutton defines the scalable method as learning from experience with explicit goals to judge right and wrong.
  • He says LLMs start in the wrong place by lacking goals and continual learning mechanisms.
INSIGHT

Animals Learn By Prediction, Not Supervision

  • Sutton argues that supervised imitation isn't the basic learning mechanism in animals; instead, trial-and-error prediction drives learning.
  • He views formal schooling as an exception, not the natural template for intelligence.
INSIGHT

Knowledge Is About The Experience Stream

  • Sutton formalizes the experiential paradigm: continuous sensation, action, and reward form the stream agents learn from.
  • Knowledge is statements about that stream and can be tested and updated continually.
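The experience stream Sutton describes is the standard agent–environment loop of reinforcement learning: sensation in, action out, reward back, repeated forever, with learning happening continually inside the loop. A minimal sketch, using a hypothetical two-state toy environment and a simple incremental value update (all names and parameters here are illustrative, not from the episode):

```python
import random

class ToyEnvironment:
    """Hypothetical environment: the agent is rewarded for matching
    its action to the current sensed state (0 or 1)."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def step(self, action):
        reward = 1.0 if action == self.state else 0.0
        self.state = self.rng.randint(0, 1)  # next sensation
        return self.state, reward

class StreamAgent:
    """Learns action values from the reward stream itself, via an
    incremental trial-and-error update -- no fixed training set."""
    def __init__(self, epsilon=0.1, step_size=0.1, seed=0):
        self.q = {}           # (state, action) -> estimated reward
        self.epsilon = epsilon
        self.step_size = step_size
        self.rng = random.Random(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:      # occasional exploration
            return self.rng.randint(0, 1)
        return max((0, 1), key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward):
        # Move the estimate a fraction of the way toward what was observed.
        key = (state, action)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.step_size * (reward - old)

# The experience stream: sense, act, receive reward, learn -- continually.
env = ToyEnvironment()
agent = StreamAgent()
state = env.state
for t in range(2000):
    action = agent.act(state)
    next_state, reward = env.step(action)
    agent.update(state, action, reward)  # knowledge is tested and updated on the stream
    state = next_state
```

The point of the sketch is that the agent's "knowledge" (its value estimates) is nothing but statements about the stream, and every new moment of experience tests and revises them.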