What shifted over the past few years is that the Internet made falsifying news items valuable in a way that it wasn’t before. There’s indirect value in sharing things that attract a lot of attention on social media: likes, follows and so on. But there’s also money to be made. A group of teenagers in Macedonia built a large network of sites spreading false information shortly before the 2016 election not because they had a dog in the fight but because saying outrageous things drove a lot of traffic to them, which in turn drove a lot of ad revenue.
It’s also a moment in which there’s direct political value in spreading misinformation. Setting aside the value that Trump himself sees in misrepresenting reality, there has been a deliberate effort to spread inaccurate information about political candidates and issues (Hillary Clinton, Brexit, etc.) both to shape public opinion and to make it harder to discern real from fake. If you see one article claiming that millions of people were caught voting illegally and another reporting (correctly) that there is no evidence at all of widespread voter fraud, the issue may seem as though it’s still being litigated.
There’s been a sudden interest in a new generation of tools that can be used to create misleading news stories, driven in part by users of the site Reddit who figured out a relatively easy way to superimpose celebrities’ faces onto videos of porn stars. Until last week, there was a community on Reddit called “deepfakes” that passed around not only modified porn but also examples of other video clips in which one person’s face was substituted for another’s.
Like swapping Nicolas Cage’s face in for Amy Adams singing “I Will Survive.”
This is not exactly realistic, you’ll notice. But it’s easy to see how, using a different model and a different voice, this could be more convincing.
There are two other things to remember here. The first is that, while professional animators and computer-effects specialists have been able to create fairly realistic virtual people for some time (including virtual versions of real people), better technologies mean that it’s easier for nonprofessionals to create artificial videos such as the one above. More remarkable than the deepfakes tool, in some ways, was the deepfakes community, a group of people — not just one expert — who all had at their disposal a way to put Person X’s face on Person Y’s body.
The other thing to remember is how much the bar has slipped for what can easily be forged. Three decades ago, creating and spreading a fake version of a newspaper article involved significant photo-editing skills and real-world printing technology all to create something that would have to be shared by hand. The introduction of Photoshop made that easier, and the introduction of the Web made sharing doctored images fairly trivial, as we’re reminded each time a shark swims on a highway after a natural disaster. But creating a fake recording of someone’s voice or a fake video of a celebrity doing something? That was significantly harder — in the past, anyway.
Another layer of complexity stems from our being in a weird technological moment. There’s lots of low-quality footage and photographs of important people out there, thanks to cameras’ slow evolution over the course of an adult American’s lifetime. It’s easier to create convincing grainy, low-res video than high-definition video, which matters if you wanted to, say, create a video of a political candidate saying something controversial back in the 1990s.
The deepfakes technology is not the only tool that will probably become broadly available in the upcoming months and years. We’ve compiled a number of existing technologies that may in the future be used to create increasingly realistic versions of photos and videos that could be used to mislead news consumers.
Changing how real people appear to behave
Manipulating the gaze of someone in a photograph
How this could be used: To change an apparent reaction of the subject of a photograph.
Researchers Yaroslav Ganin, Daniil Kononenko, Diana Sungatullina and Victor Lempitsky of the Skolkovo Institute of Science and Technology created a system that allows for the manipulation of a person’s eyes in a photo. Notice how as Obama’s eyes move, above, the light in them doesn’t change.
There’s an online demo that allows you to manipulate photos you upload. (Be aware that uploading images to random websites is itself something that should be done with caution.)
Changing the expressions of someone in a video
How this could be used: Footage of an audience (say, at a State of the Union address) could be manipulated so that, judging by how their mouth is moving, a member of Congress appears to be saying something that wasn’t actually said.
Face2Face is an early example of the ability to change a person’s appearance in a recorded video. In short, the process, introduced in 2016 by researchers Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt and Matthias Niessner, creates a sort of video marionette that can be made to present whatever facial expressions the user wants.
This has an obvious shortcoming: It doesn’t manipulate the audio of the original video.
Creating a fake video of a person speaking using his or her own voice
How this could be used: To change the context of where comments were made. Something Barack Obama said on the campaign trail — or in one of his audiobooks — could be made to look like something he said in the Oval Office.
This technology was introduced at the SIGGRAPH conference last year by the University of Washington’s Supasorn Suwajanakorn, Steven M. Seitz and Ira Kemelmacher-Shlizerman. It gets a little closer to the ability to generate a fully artificial version of a speech from a real person.
The full paper makes clear how tricky this seemingly simple process is, requiring the creation of a virtual model of Obama’s face and systems for correcting specific visual anomalies.
Manipulating the words of someone using samples of their own voice
How this could be used: To create a controversial statement by a political candidate.
This is a remarkable video from Adobe Systems. Sentences can be rearranged to make it sound as though the source for a voice said something that was never said.
Think about a newly discovered recording of someone having used a racial slur back in college. The recording is scratchy and at times muffled. Would you feel confident that it was legitimate if you stumbled upon it online?
Now imagine that this technology is combined with the technology above: building out an artificial version of what someone said and where they said it.
Creating fake people
We live in a world, too, where random people can suddenly become important contributors to the national conversation, thanks to social media. The ability to generate realistic human actors from scratch makes it possible to invent witnesses who don’t exist to events that didn’t happen.
Creating realistic photographs of nonexistent people
How this could be used: To mask a fraudulent social media account by using a unique identifying image.
The New York Times reported on an effort by Nvidia to generate realistic-looking versions of photographs of people. The results are often strikingly good (and often quite poor).
One way in which media outlets check the validity of social media accounts is to see whether the photos used as the accounts’ avatars exist elsewhere online. For a normal person such as you, the photo you use on Facebook or Twitter is just one of hundreds you probably have on your phone. For someone creating fake bot accounts, photos are often pulled from somewhere else online. You’re one person who takes lots of photos of yourself; they’re one person who’s creating lots of other people with no original photos at all. This technology would allow for the creation of a number of seemingly realistic photos that can’t be tracked back to someone else online but that appear to bolster the legitimacy of the account.
Creating realistic speaking voices for nonexistent people
How this could be used: To leave a voice mail for a reporter about something that didn’t actually happen.
Google, meanwhile, has created Tacotron 2, a “neural network architecture for speech synthesis directly from text.” In English, it’s a process for improved artificial voices. In the clip above, you can hear a generated version of the sentence “He thought it was time to present the present” — including the proper pronunciation of the two forms of “present.”
The search giant’s goal, obviously, is to improve the responses of tools such as its Google Home interactive device. But the ability to create realistic artificial voices has obvious other applications.
Changing the appearance of where something took place
Adding a virtual environment to live video
How this could be used: In an extreme example, imagine a military officer appearing to conduct an interview from within the Oval Office.
Part of the shift in the ability to create realistic forgeries of photos and videos is the expansion of access to powerful computers. Beyond the technological question of how to create virtual environments in the first place, the ability of a computer to draw those environments quickly has improved dramatically.
The video above, first posted to Facebook by Oscar Olarte Ruiz, shows the real-time addition of an actor to an existing virtual scene.
Changing the time of day or weather in a photo or video
How this could be used: To make an event that happened in Alaska appear to have happened in Washington — or in the spring.
Researchers Ming-Yu Liu, Thomas Breuel and Jan Kautz created a tool that can literally change winter to summer or day to night. (Or, we’ll note, one breed of dog to another, should you want to do that.)
We include this example last because it gives a sense of the scope of what’s possible. Existing video can be altered to change the sky and the trees and even whether it’s light or dark. Everything about where the video was created can be turned into its opposite.
The effect of these technologies becoming commonplace could be significant. A world in which it’s hard to tell reality from forgeries is a world in which misinformation and falsehood thrive.
Reporter Charlie Warzel spoke with technologist Aviv Ovadya, who before the 2016 election warned about the ecosystem of misinformation that was thriving online.
“We were utterly screwed a year and a half ago, and we’re even more screwed now,” Ovadya told Warzel for BuzzFeed. “And depending how far you look into the future, it just gets worse.”
How were we screwed a year and a half ago? In part because the currency of information sharing on the Web is the website, and websites themselves are very easy to fake and easy to doctor in images. In part because of the economy that exists around sharing. But mostly for the reason that we’ll continue to be in trouble: Regardless of economics, there’s often political value in providing people with untrue information about those they dislike.
Even as we were writing this, a forged document was being peddled on social media to allege corruption by an elected official. It’s been shared thousands of times and required nothing more sophisticated than Photoshop and a Twitter account.