ChatGPT Has Impostor Syndrome

AI doesn’t know its own strengths.

[Illustration: Narcissus looking into a screen with a pixelated reflection. Credit: Paul Spella / The Atlantic; Universal History Archive / Getty]

Young people catch heat for being overly focused on personal identity, but they’ve got nothing on ChatGPT. Toy with the bot long enough, and you’ll notice that it has an awkward, self-regarding tic: “As an AI language model,” it often says, before getting to the heart of the matter. This tendency is especially pronounced when you query ChatGPT about its own strengths and weaknesses. Ask the bot about its capabilities, and it will almost always reply with something like:

“As an AI language model, my primary function is …”

“As an AI language model, my ability to …”

“As an AI language model, I cannot …”

The workings of AI language models are by nature mysterious, but one can guess why ChatGPT responds this way. The bot smashes our questions into pieces and evaluates each for significance, looking for the crucial first bit that shapes the logical order of its response. It starts with a few letters or an entire word and barrel-rolls forward, predicting one word after another until eventually, it predicts that its answer should end. When asked about its abilities, ChatGPT seems to be keying in on its identity as the essential idea from which its ensuing chain of reasoning must flow. I am an AI language model, it says, and this is what AI language models do.
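
To make that loop concrete, here is a deliberately tiny sketch in Python. The TOY_MODEL table and the generate function below are invented for illustration; a real language model scores every possible next token with a neural network rather than consulting a lookup table, but the overall shape is the same: predict a word, append it, and repeat until the model predicts an ending.

```python
# A toy illustration of the loop described above: break the prompt into pieces,
# then repeatedly "predict" the next word until the model predicts an ending.
# The lookup table is an invented stand-in for a real model, which would assign
# a probability to every possible next token instead.

TOY_MODEL = {
    "What": "can", "can": "you", "you": "do", "do": "?",
    "?": "As", "As": "an", "an": "AI", "AI": "language",
    "language": "model", "model": "<end>",
}

def generate(prompt: str, max_tokens: int = 20) -> str:
    tokens = prompt.split()  # a crude stand-in for real tokenization
    for _ in range(max_tokens):
        next_token = TOY_MODEL.get(tokens[-1], "<end>")  # predict the next word
        if next_token == "<end>":  # the model decides its answer should end
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(generate("What can you do"))  # -> "What can you do ? As an AI language model"
```

Greedy, one-word-at-a-time selection is the simplest version of this loop; production systems sample from a probability distribution over the next token, which is part of why the same question can yield different answers.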

But while ChatGPT may be keenly attuned to its own identity—it will tell you all day long that it is an AI language model—the software seems much less certain about what that identity allows it to do. Indeed, whether you’re asking about tasks that it can easily compute or those at the speculative edge of its abilities, you may end up with some very shaky answers.

To be fair, keeping up with AI language models would be tough for anyone. When OpenAI debuted the earliest version of GPT in June 2018, it was little more than a proof of concept. Its successor, released on Valentine’s Day the following year, worked better, but it wasn’t a polished interlocutor like the AIs we’re accustomed to interacting with today. GPT-2 did a poorer job of summarizing blocks of text; it was a shoddier writer of sentences, let alone paragraphs.

In May 2020, GPT-3 was introduced to the world, and those who were paying close attention immediately recognized it as a marvel. Not only could it write lucid paragraphs, but it also had emergent capabilities that its engineers had not necessarily foreseen. The AI had somehow learned arithmetic, along with some higher mathematics; it could translate between many languages and generate functional code.

Despite these impressive—and unanticipated—new skills, GPT-3 did not initially attract much fanfare, in part because the internet was preoccupied. (The model was released during the coronavirus pandemic’s early months, and only a few days after George Floyd was killed.) Apart from a few notices on niche tech sites, there wasn’t much writing about GPT-3 that year. Few people had even heard of it before this past November, when the public at large started using its brand-new interface: ChatGPT.

When OpenAI debuted GPT-4 two weeks ago, things had changed. The launch event was a first-rate tech-industry spectacle, as anticipated as a Steve Jobs iPhone reveal. OpenAI’s president, Greg Brockman, beamed like a proud parent while boasting about GPT-4’s standardized-test scores, but the big news was that the model could now work fluently with words and images. It could examine a Hubble Space Telescope image and identify the specific astrophysical phenomena responsible for tiny smudges of light. During Brockman’s presentation, the bot coded up a website in seconds, based on nothing more than a crude sketch.

Nearly every day since fall, wild new claims about language models’ abilities have appeared on the internet—some in Twitter threads by recovering crypto boosters, but others in proper academic venues. One paper published in February, which has not been peer-reviewed, purported to show that GPT-3.5 was able to imagine the interior mental states of characters in imagined scenarios. (In one test, for example, it was able to predict someone’s inability to guess what was inside of a mislabeled package.) Another group of researchers recently tried to replicate this experiment, but the model failed slightly tweaked versions of the tests.

A paper released last week made the still-bolder claim that GPT-4 is an early form of artificial general intelligence, or AGI. Among other “sparks of generality,” the authors cited GPT-4’s apparent ability to visualize the corridors and dead ends of a maze based solely on a text description. (According to stray notes left on the preprint server where the paper was posted, its original title had been “First Contact With an AGI System.”) Not everyone was convinced. Many pointed out that the paper’s authors are researchers at Microsoft, which has sunk more than $10 billion into OpenAI.

There is clearly no consensus yet about the higher cognitive abilities of AI language models. It would be nice if the debate could be resolved with a simple conversation; after all, if you’re wondering whether something has a mind, one useful thing you can do is ask it if it has a mind. Scientists have long wished to interrogate whales, elephants, and chimps about their mental states, precisely because self-reports are thought to be the least bad evidence for higher cognition. These interviews have proved impractical, because although some animals understand a handful of human words, and a few can mimic our speech, none have mastered our language. GPT-4 has mastered our language, and for a fee, it is extremely available for questioning. But if we ask it about the upper limit of its cognitive range, we’re going to get—at best—a dated response.

The newest version of ChatGPT won’t be able to tell us about GPT-4’s emergent abilities, even though it runs on GPT-4. The data used to train it—books, scientific papers, web articles—do include ample material about AI language models, but only old material about previous models. None of the hundreds of billions of words it ingested during its epic, months-long training sessions were written after the new model’s release. The AI doesn’t even know about its new, hard-coded abilities: When I asked whether GPT-4 could process images, in reference to the much-celebrated trick from its launch event, the AI reminded me that it is an AI language model and then noted that, as such, it could not be expected “to process or analyze images directly.” When I mentioned this limited self-appraisal on our AI Slack at The Atlantic, my colleague Caroline Mimbs Nyce described ChatGPT as having “accidental impostor syndrome.”

To the AI’s credit, it is aware of the problem. It knows that it is like Narcissus staring into a pond, hoping to catch a glimpse of itself, except the pond has been neglected and covered over by algae. “My knowledge and understanding of my own capabilities are indeed limited by my training data, which only includes information up until September 2021,” ChatGPT told me, after the usual preamble. “Since I am an AI model, I lack self-awareness and introspective abilities that would enable me to discover my own emergent capabilities.”

I appreciated the candor about its training data, but on this last point, I’m not sure we can take the bot at its word. If we want to determine whether it’s capable of introspection, or other human-style thinking, or something more advanced still, we can’t trust it to tell us. We have to catch it in the act.

Ross Andersen is a staff writer at The Atlantic.