It was mainly out of self-amusement that Chris Ume decided to create a fake Tom Cruise.
Ume is now back and on a mission — to commercialize video deepfakes for the planned metaverse, making them as central to digital life as tweets and memes.
He’ll take that next step Tuesday, when a deepfake developed by Metaphysic, the company he formed with entrepreneur Tom Graham, will compete in the semifinals of the NBC reality hit “America’s Got Talent.”
“This is a good chance to raise awareness and show off what we can do,” said Ume.
“We think the web would be much better if instead of avatars we lived in the world of the hyper-real,” Graham added, describing users’ ability to manipulate actual faces with Metaphysic.
The start-up’s appearance on one of the most-watched summer network shows will lay the groundwork for Metaphysic’s new website, one in which ordinary people can have their faces say and do things they never did in real life. (Many other such sites are aimed at programmers and researchers.)
The act — which will follow up a raucous preliminary-round appearance in which Metaphysic overlaid a young Simon Cowell’s face on the screen above the stage — will offer a shiny advertisement for a technology that has been democratizing with astonishing speed. On Tuesday night, the company staged its new feat: an opera “performance” in which the faces of Cowell, fellow judge Howie Mandel and host Terry Crews sang dramatically. All three effused about the performance.
Yet some critics are horrified by this celebratory moment on a top-rated television show. Video deepfakes, they say, blur a line between fiction and reality that’s barely clear now. If disinformation-peddlers can have so much success with words and doctored images, imagine, they ask, what they can do with a full video.
“We’re quickly entering a world where everything, even videos, can be manipulated by pretty much anyone who wants to,” said Hany Farid, a professor at the University of California at Berkeley and an expert on deepfakes. “What can go wrong?”
The unveiling comes at the end of a frenetic summer in the world of deepfakes, which use the deep learning of artificial intelligence to create fake media (supporters prefer “synthetic” or “AI-generated”).
While many Americans were blissfully engaging in quaint analog activities like going to the beach, a start-up named Midjourney offered “AI art-generation,” in which anyone with a basic graphics card could, with a few keystrokes, create stunningly real images. To spend even a few minutes with it — there’s Gordon Ramsay burning up in his Hell’s Kitchen; here’s Gandalf shredding on a guitar — is to experience a technology that makes Photoshop look like Wite-Out. Midjourney has gathered more than a million users on its Discord channel.
And three weeks ago, a start-up named Stability AI released a program called Stable Diffusion. The AI image-generator is an open-source program that, unlike some rivals, places few limits on the images people can create, leading critics to say it can be used for scams, political disinformation and privacy violations.
“We should be worried. I follow the technology every day, and I’m worried,” said Subbarao Kambhampati, a professor at the School of Computing & AI at Arizona State University who has studied deepfakes and virtual identities. He said he expects the “AGT” moment will make platforms like these take off even further, while the technology continues to improve by the day.
“It’s moving so fast. Soon anyone will be able to create a moon landing that looks like the real thing,” he said.
Ume and Graham say deceit is not their goal. Ume emphasizes the entertainment value: The company will market itself to Hollywood studios that want to present dead actors in movies (with an estate’s permission) or have performers play against their younger selves.
For ordinary users, Ume says, the aim of Metaphysic’s new site is to make online interactions feel more real, with none of the whimsy of video games or the flatness of Zoom. “I imagine being able to have breakfast with my grandparents in Belgium from here in Bangkok and feel like I’m really there,” said Ume from his current base.
Graham thinks synthetic media will, far from damaging privacy, bolster it. “I would like to see a world where communication online is a more humane experience owned and controlled by humans,” said Graham, a Harvard-educated lawyer who founded a digital graphics company before turning to crypto and, eventually, deepfakes. “I don’t think that happens in the Web2 world of today.”
Farid is unconvinced. “They’re only telling half the story — the one about you using your own image,” he said. “The other side is someone else using it to defraud, spread disinformation and disrupt society. And you have to ask if being able to move around a little more on Zoom is worth that.”
Deepfake technology began eight years ago with the use of “generative adversarial networks.” Created by computer scientist Ian Goodfellow, the technique essentially pits two AIs against each other, competing to produce the most realistic images. The results were far superior to basic machine-learning techniques. Goodfellow would go on to work for Google, Apple and now DeepMind, a Google subsidiary.
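The adversarial idea can be sketched in a toy form. The snippet below is an illustrative, dependency-light example, not Goodfellow’s original implementation: a two-parameter “generator” learns to mimic a simple bell curve of numbers while a “discriminator” tries to tell real samples from fakes, the same tug-of-war that, at vastly larger scale, yields photorealistic faces. All parameter choices here are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: samples from N(4, 1), the distribution the generator must learn.
def real_batch(n):
    return rng.normal(4.0, 1.0, n)

# Generator G(z) = a*z + b maps standard-normal noise to fake samples.
# Discriminator D(x) = sigmoid(w*x + c) scores how "real" a sample looks.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.1, 0.0          # discriminator parameters
lr = 0.02

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for step in range(4000):
    # Discriminator update: push D(real) toward 1 and D(fake) toward 0.
    xr = real_batch(64)
    z = rng.normal(0.0, 1.0, 64)
    xf = a * z + b
    dr, df = sigmoid(w * xr + c), sigmoid(w * xf + c)
    w -= lr * np.mean(-(1 - dr) * xr + df * xf)
    c -= lr * np.mean(-(1 - dr) + df)

    # Generator update: push D(fake) toward 1 (non-saturating GAN loss).
    z = rng.normal(0.0, 1.0, 64)
    xf = a * z + b
    df = sigmoid(w * xf + c)
    gx = -(1 - df) * w           # gradient of generator loss w.r.t. fake samples
    a -= lr * np.mean(gx * z)
    b -= lr * np.mean(gx)

# After training, the generator's samples should cluster near the real mean of 4.
fake = a * rng.normal(0.0, 1.0, 1000) + b
print(f"generated mean ~ {np.mean(fake):.2f} (target 4.0)")
```

The key design point is that neither network is told what the real distribution looks like: the generator improves only because the discriminator keeps finding ways to catch it.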
Early on, deepfakes were used primarily by skilled exploiters, who infamously grafted actresses’ faces onto pornographic videos. But as the technology has come to require fewer specialized tools, it can increasingly be deployed by everyday people for a range of uses — which Metaphysic hopes to further.
The company earlier this year attracted a $7.5 million investment from the likes of the Winklevoss twins, the social-media-turned-crypto entrepreneurs, and Section 32, the VC fund from original Google Ventures founder Bill Maris. “We believe the impact will be far-ranging,” Andy Harrison, managing partner at Section 32, said of Metaphysic. Harrison, also a Google veteran, said he saw video deepfakes not as a threat but as an enlivening change to consumption and communication.
“Frankly, I’m pretty excited,” he said. “I think it’s a new era in entertainment and social interaction.”
Critics, though, worry about the “liar’s dividend,” in which a web flooded with video deepfakes muddies the water even for legitimate videos, causing no one to believe anything.
“Video has been the last frontier of verification online. And now it could be gone, too,” Farid said. He cited the unifying power of the George Floyd video in 2020 as one example of what wouldn’t happen in a world flooded by deepfake videos.
Asked about “AGT’s” role in promoting deepfakes, a spokesperson for production company Fremantle declined to comment for this story. But a person close to the show, who spoke on the condition of anonymity because they were legally prohibited from commenting on an ongoing competition, said they believed there was a social utility to the Metaphysic act. “By using the innovation in a completely transparent way,” the person said, “they are showing a mainstream audience how this technology can work.”
One solution to the truth issue could come in the form of authentication. A cross-industry effort involving Adobe, Microsoft and Intel aims to verify and make transparent the creator of every video, to assure people it is real. But it’s not clear how widely such a system would be adopted.
Kambhampati, the ASU researcher, said he fears the world will end up in one of two places: “Either nobody trusts anything they watch anymore, or we need an elaborate system of authentication so they do.”
“I hope it’s the second,” he said, then added, “not that that seems so great, either.”
The story has been updated with details from Tuesday night’s episode of “America’s Got Talent.”