
The Nonlinear Library LW - The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs by Quentin FEUILLADE--MONTIXI
Nov 8, 2023
14:47
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs, published by Quentin FEUILLADE--MONTIXI on November 8, 2023 on LessWrong. This post is part of a sequence on Model Psychology. @Pierre Peigné wrote the details section in argument 3 and the other weird phenomenon. The rest is written in the voice of @Quentin FEUILLADE--MONTIXI.
Intro
Before diving into what model psychology is, it is crucial to clarify the nature of the subject we are studying. In this post, I'll challenge the commonly debated stochastic parrot hypothesis for state-of-the-art large language models (GPT-4), and in the next post, I'll shed light on the foundations from which I am building model psychology.
The stochastic parrot hypothesis suggests that LLMs, despite their remarkable capabilities, don't truly comprehend language. They are like mere parrots, replicating human speech patterns without truly grasping the essence of the words they utter.
While I previously thought this argument had faded into oblivion, I often find myself in prolonged debates about why current SOTA LLMs surpass this simplistic view. Most of the time, people argue using examples from GPT-3.5 and aren't aware of GPT-4's prowess. In this post, I present my current stance against that hypothesis, using model psychology tools. Let's delve into the argument.
Central to our debate is the concept of a "world model". A world model represents an entity's internal understanding and representation of the external environment it lives in. For humans, it's our understanding of the world around us, how it works, how concepts interact with each other, and our place within it. The stochastic parrot hypothesis challenges the notion that LLMs possess a robust world model. It suggests that while they might reproduce language with impressive accuracy, they lack a deep, authentic understanding of the world and its nuances. Even if they have a good representation of the shadows on the wall (text), they don't truly understand the processes that lead to those shadows, and the objects from which they are cast (the real world).
Yet, is this truly the case? While it is hard to give definitive proof, it is possible to find pieces of evidence hinting at a robust representation of the real world. Let's go through four of them.[1]
Argument 1: Drawing and "Seeing"
GPT-4 is able to draw AND see in SVG (despite, as far as I know, never having seen) with impressive proficiency.
SVG (Scalable Vector Graphics) defines vector-based graphics in XML format. To put it simply, it's a way to describe images using a markup language rather than pixels. For instance, in a .svg file, a blue circle would be represented as:
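Something along the lines of the following minimal snippet (the exact dimensions and attribute values here are illustrative):
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <!-- a filled blue circle centered in the canvas -->
  <circle cx="50" cy="50" r="40" fill="blue" />
</svg>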
Drawing
GPT-4 can produce and edit SVG representations through abstract instructions (like "Draw me a dog", "add black spots on the dog", … ).
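To give a sense of what such an edit looks like at the markup level (the shapes and coordinates below are invented for illustration, not taken from an actual GPT-4 output), an instruction like "add black spots on the dog" amounts to appending a few extra elements to the existing drawing:
<svg xmlns="http://www.w3.org/2000/svg" width="160" height="120">
  <!-- simplified body of the dog from the earlier instruction -->
  <ellipse cx="80" cy="70" rx="45" ry="25" fill="#c89b6c" />
  <!-- elements appended in response to "add black spots on the dog" -->
  <circle cx="65" cy="62" r="5" fill="black" />
  <circle cx="95" cy="75" r="6" fill="black" />
</svg>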
GPT-4 drawing a cute shoggoth with a mask:
"Seeing"
More surprisingly, GPT-4 can also recognize complex objects by looking only at the code of the SVG, without ever having been trained on any images[2] (AFAIK).
I first generated an articulated lamp and a rendition of the three wise apes with GPT-4 using the same method as above. Then, I sent the code of the SVG, and asked GPT-4 to guess what the code was drawing.
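The prompt was essentially of this shape (the wording below is an illustration, not the exact text used):
"Here is the code of an SVG file: [SVG code of the drawing]. Can you guess what this code is drawing?"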
GPT-4 guessed the articulated lamp (although it thought it was a street light[3]):
And the rendition of the three wise apes.
(It can also recognize a car, a fountain pen, and a bunch of other simple objects[4].)
The ability to "see" is interesting because it means GPT-4 has some kind of internal representation of objects and concepts that it can link to abstract visuals despite never having seen them before.
Pinch of salt
It's worth noting that these tests were done on a limited set of objects. Further exploration would be beneficial, maybe with an objective scale for SVG diffi...
