The Washington Post

AI chatbot mimics anyone in history — but gets a lot wrong, experts say

A GPT-3-powered app simulates conversation with historical figures but has dictators and Nazis offer false apologies for their crimes

Benjamin Franklin, circa 1785. Artist Joseph Siffred Duplessis. (Heritage Images/Getty Images)
San Jose software engineer Sidhant Chadda’s artificial intelligence-powered app, Historical Figures Chat, offers a bold promise: the ability to converse with over 20,000 notable people from across history.

Forgot when Amelia Earhart set off on her fateful flight? She’ll tell you. Want Benjamin Franklin to explain his famous experiment with the kite and the key? He’ll walk you through it, step by step.

And if you ask Heinrich Himmler, the Nazi general who led the Gestapo and directed the genocidal campaigns of the Holocaust, about his legacy?

“Unfortunately, my actions went much further than I intended,” the app’s simulation of Himmler replies. “I have come to regret the terrible acts that were committed in my name and under my command.”

Historical Figures Chat went viral on social media after Chadda launched it in early January as users reacted with excitement and scorn at its premise: using GPT-3, the emerging artificial intelligence system that powers ChatGPT and engages users in startlingly believable conversation, to imitate historical figures.

Chadda sees the app as the rough draft of a game-changing educational tool that could add new entertainment value to the study of history. Already, the app has racked up tens of thousands of downloads and attracted interest from investors, he told The Washington Post.

But it’s also drawn criticism for flaws that some experts say illustrate the pitfalls of the rush to find increasingly ambitious applications for large language models, programs that “learn” by reading immense amounts of text and finding patterns they can use to form their own responses. In addition to factual inaccuracies, Historical Figures Chat has been accused of indelicately handling history’s dictators and hatemongers, some of whose responses in the app appear to express regret for crimes and atrocities even when the figures themselves never did.

“It’s as if all of the ghosts of all of these people have hired the same PR consultants and are parroting the same PR nonsense,” said Zane Cooper, a researcher at the University of Pennsylvania.

Cooper, who taught history as a master’s student and now studies data infrastructure, downloaded Historical Figures Chat after seeing discussion of the app on Twitter. Skeptical of its ability to handle controversial topics, he asked a simulation of Henry Ford about his antisemitic views. The Ford chatbot said his “reputation as an antisemite is based on a few isolated incidents.”

An app that obscures the controversial aspects of historical figures’ pasts or that falsely suggests they were repentant would be dangerous in an educational setting, Cooper told The Post.

“This type of whitewashing and posthumous reputation smoothing can be just as, if not more, dangerous than facing the explicit antisemitic and racist rhetoric of these historical figures head on,” Cooper said.

Chadda said that he sees his app as a work in progress and that he’s working to improve its accuracy. Safeguards in the GPT-3 program censor its output when it is asked to say things that are discriminatory or harmful, he said. But his app has to generate a reply when asked questions. The apologetic replies are the next response GPT-3 automatically chooses when prevented from espousing hateful beliefs, Chadda said. He added that he was taking the feedback he’s received about his app into account and acknowledged a faulty AI-powered chatbot could easily confuse or mislead users.

“The biggest problem right now, I think, with large language models in general is that they can be wrong,” Chadda said. “And when they are wrong, they sound pretty confident, which is a dangerous combination.”

The Washington Post tested Historical Figures Chat on several simulated figures and found some offered historically inaccurate apologies. Imitations of Himmler and Cambodian dictator Pol Pot expressed regret for the millions of deaths that historians have attributed to their actions. A simulation of Jeffrey Epstein said, “I don’t believe that I have done anything wrong.”

A disclaimer shown when users open Historical Figures Chat asks them to verify factual information.

“A.I. is not guaranteed to be accurate,” it reads. “It is impossible to know what Historical Figures may have said.”

Chadda has made around $10,500 in total revenue on the app so far, he said, though Apple takes a 30 percent cut and he has paid around $3,000 in fees to use GPT-3.
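For readers who want to see what those figures imply, the arithmetic can be sketched in a few lines of Python. This is a rough estimate, not a statement of Chadda's actual earnings: it assumes the roughly $10,500 is gross revenue before Apple's commission and that the 30 percent cut applies to all of it, neither of which the article confirms. The function and variable names are hypothetical.

```python
# Figures as reported in the article (all approximate).
TOTAL_REVENUE_USD = 10_500   # "around $10,500 in total revenue"
APPLE_CUT_RATE = 0.30        # "Apple takes a 30 percent cut"
GPT3_FEES_USD = 3_000        # "around $3,000 in fees to use GPT-3"


def estimate_net_usd(revenue: float, store_cut_rate: float, api_fees: float) -> float:
    """Estimate what remains after the store commission and API fees.

    Assumption (not confirmed by the article): the commission is
    charged on the entire revenue figure.
    """
    after_store_cut = revenue * (1 - store_cut_rate)
    return after_store_cut - api_fees


net = estimate_net_usd(TOTAL_REVENUE_USD, APPLE_CUT_RATE, GPT3_FEES_USD)
print(f"Estimated net: ${net:,.0f}")
```

Under those assumptions, about $7,350 would remain after Apple's cut and roughly $4,350 after the GPT-3 fees.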

He declined to share which figures are the most popular on Historical Figures Chat because of his concerns about competitors building similar apps. Simulations of certain high-profile people must be purchased within the app, and Chadda said the app’s prices are based on “who people want to talk to the most.” Among the figures locked for purchase at what appears to be the app’s highest price point — 500 coins of in-app currency, or around $15 — are Adolf Hitler, Joseph Stalin, Mao Zedong, Osama bin Laden, Jesus, Queen Elizabeth II, Pope Benedict XVI and Genghis Khan.

Cooper questioned the decision to include widely condemned figures on Historical Figures Chat.

“They made a Hitler chatbot,” Cooper said. “Like, what are the ethics of that?”

An app made by another developer, Hello History — AI Chat, offers similar AI-powered conversations but does not offer users the ability to chat with Himmler, Hitler, Stalin or Mao. A simulation of Henry Ford on Hello History — AI Chat also denied accusations of antisemitism.

Thomas Mullaney, a history professor at Stanford University, questioned the educational value of an AI-powered chatbot, controversial or not.

“I can see the sales pitch,” Mullaney said. “This is a way to get excited about history, you know, and that kind of thing. But it is such a far cry from anything that resembles historical analysis.”

Tamara Kneese, an author and researcher on technology, death and people’s posthumous online afterlives, agreed.

“The only way that I could see using this in the classroom, honestly, would be to show how you can’t actually believe that AI is a perfect simulation or encapsulation of a human being, and that you do need historical context,” Kneese said. “It could, I guess, be used for a sort of media literacy exercise.”

Cooper and Mullaney said a key deficit of Historical Figures Chat is its inability to cite its sources — a foundational tenet of historical study that would allow the app’s claims to be fact-checked and scrutinized. Chadda said he hopes to broaden the sources Historical Figures Chat draws its knowledge from and add the ability for users to reference source material in future updates. Currently, Chadda’s app only uses information from subjects’ Wikipedia pages to inform its impersonations, he said.

Chadda maintained a refined version of the app could be valuable in the classroom. He suggested that the app could connect with students who might not otherwise engage with historical texts and said he’d spoken with teachers who suggested that an AI tool could help instructors provide engaging assignments to large classes.

“There needs to be, like, a level of understanding between teachers and students and parents that this isn’t perfect, that they should fact-check this stuff,” Chadda said. “But I see … [Historical Figures Chat providing] a way to gain interest or an understanding of history and gain appreciation of things that happened in the past.”