Why You Fell for the Fake Pope Coat

The pope didn’t actually wear that great jacket, but a lot of people were ready to believe he did.

An AI-generated image showing the pope in a white puffy jacket (Illustration by The Atlantic. Source: Nikita Singareddy / Midjourney V5.)

Being alive and on the internet in 2023 suddenly means seeing hyperrealistic images of famous people doing weird, funny, shocking, and possibly disturbing things that never actually happened. In just the past week, the AI art tool Midjourney rendered two separate convincing, photographlike images of celebrities that both went viral. Last week, it imagined Donald Trump’s arrest and eventual escape from jail. Over the weekend, Pope Francis got his turn in Midjourney’s maw when an AI-generated image of the pontiff wearing a stylish white puffy jacket blew up on Reddit and Twitter.

But the fake Trump arrest and the pope’s Balenciaga renderings have one meaningful difference: While most people were quick to disbelieve the images of Trump, the pope’s puffer duped even the most discerning internet dwellers. This distinction clarifies how synthetic media—already treated as a fake-news bogeyman by some—will and won’t shape our perceptions of reality.

Pope Francis’s rad parka fooled savvy viewers because it depicted what would have been a low-stakes news event—the type of tabloid-y non-news story that, were it real, would ultimately get aggregated by popular social-media accounts, then by gossipy news outlets, before maybe going viral. It’s a little nugget of internet ephemera, like those photos that used to circulate of Vladimir Putin shirtless.

As such, the image doesn’t demand strict scrutiny. When I saw the image in my feed, I didn’t look too hard at it; I assumed either that it was real and a funny example of a celebrity wearing something unexpected, or that it was fake and part of an online in-joke I wasn’t privy to. My instinct was certainly not to comb the photo for flaws typical of AI tools (I didn’t notice the pope’s glitchy hands, for example). I’ve talked with a number of people who had a similar response. They were momentarily duped by the image but described their experience of the fakery in a more ambient sense—they were scrolling; saw the image and thought, Oh, wow, look at the pope; and then moved along with their day. The Trump-arrest images, in contrast, depicted an anticipated news event that, had it actually happened, would have had serious political and cultural repercussions. One does not simply keep scrolling along after watching the former president get tackled to the ground.

So the two sets of images are a good illustration of the way that many people assess whether information is true or false. All of us use different heuristics to try to suss out truth. When we receive new information about something we have existing knowledge of, we simply draw on facts that we’ve previously learned. But when we’re unsure, we rely on less concrete heuristics like plausibility (would this happen?) or style (does something feel, look, or read authentically?). In the case of the Trump arrest, both the style and plausibility heuristics were off.

“If Trump has been publicly arrested, I’m asking myself, Why am I seeing this image but Twitter’s trending topics, tweets, and the national newspapers and networks are not reflecting that?” Mike Caulfield, a researcher at the University of Washington’s Center for an Informed Public, told me. “But for the pope your only available heuristic is Would the pope wear a cool coat? Since almost all of us don’t have any expertise there, we fall back on the style heuristic, and the answer we come up with is: maybe.”

As I wrote last week, so-called hallucinated images depicting big events that never took place work differently than conspiracy theories, which are elaborate, sometimes vague, and frequently hard to disprove. Caulfield, who researches misinformation campaigns around elections, told me that the most effective attempts to mislead come from actors who take solid reporting from traditional news outlets and then misframe it.

Say you’re trying to gin up outrage around a local election. A good way to do this would be to take a reported news story about voter outreach and incorrectly infer malicious intent about a detail in the article. A throwaway sentence about a campaign sending election mailers to noncitizens can become a viral conspiracy theory if a propagandist suggests that those mailers were actually ballots. Alleging voter fraud, the conspiracists can then build out a whole universe of mistruths. They might look into the donation records and political contributions of the secretary of state and dream up imaginary links to George Soros or other political activists, creating intrigue and innuendo where there’s actually no evidence of wrongdoing. “All of this creates a feeling of a dense reality, and it’s all possible because there is some grain of reality at the center of it,” Caulfield said.

For synthetic media to deceive people in high-stakes news environments, the images or video in question will have to cast doubt on, or misframe, accurate reporting on real news events. Scenarios invented out of whole cloth are easy to check against the rest of the news environment, so even casual scrollers can very easily find the truth. But that doesn’t mean that AI-generated fakes are harmless. Caulfield described in a tweet how large language models, or LLMs, and related generative tools such as Midjourney are masters at manipulating style, which people have a tendency to link to authority, authenticity, and expertise. “The internet really peeled apart facts and knowledge, LLMs might do similar with style,” he wrote.

Style, he argues, has never been the most important heuristic to help people evaluate information, but it’s still quite influential. We use writing and speaking styles to evaluate the trustworthiness of emails, articles, speeches, and lectures. We use visual style in evaluating authenticity as well—think about company logos or online images of products for sale. It’s not hard to imagine that flooding the internet with low-cost information mimicking an authentic style might scramble our brains, similar to how the internet’s democratization of publishing made the process of simple fact-finding more complex. As Caulfield notes, “The more mundane the thing, the greater the risk.”

Because we’re in the infancy of a generative-AI age, it’s premature to suggest that we’re tumbling headfirst into the depths of a post-truth hellscape. But consider these tools through Caulfield’s lens: successive technologies, from the early internet to social media to artificial intelligence, have each targeted a different information-processing heuristic and cheapened it in turn. The cumulative effect conjures an eerie image of these technologies as a roiling sea, slowly chipping away at the tools we need to make sense of the world and remain resilient. A slow erosion of some of what makes us human.

Charlie Warzel is a staff writer at The Atlantic and the author of its newsletter Galaxy Brain, about technology, media, and big ideas. He can be reached via email.