Is that video real?
Manipulated video is everywhere. After all, our favorite movies and TV shows use editing and computer-generated imagery to create fantastical scenes all the time. Where things get hairy is when doctored videos are presented as accurate depictions of real events. There are two general kinds of deception:
- Cheapfakes: These are videos that are altered using classical video editing techniques like dubbing, speeding up or slowing down, or splicing together different scenes to change context.
- Deepfakes: These are videos that are altered or generated using artificial intelligence, neural networks and machine learning.
Last year a video of House Speaker Nancy Pelosi circulated online that was slowed down, making it look like she was intoxicated.
Edward J. Delp, a professor at the School of Electrical and Computer Engineering at Purdue University, has been studying media forensics for 25 years. He says wider access to AI software and advanced editing tools means almost anyone can create fake content. And it doesn’t need to be sophisticated to be effective.
Edward J. Delp, Professor of Electrical and Computer Engineering, Purdue University
“People will buy into things that reinforce their current beliefs,” he says. “So they’ll believe even a poorly manipulated video if it’s about someone they don’t like or think of in a certain way.”
Delp’s team develops ways to detect fake videos. Here are some of his tips for spotting cheapfakes and deepfakes lurking around your social media feeds:
Focus on the natural details
“Look at the person in the video and see if their eyes are blinking in a weird way,” Delp says. The technology used to make deepfake videos has a hard time replicating natural blinking patterns and movements because of the way the systems are trained.
“Also, by watching their head motion, you may be able to see if there is unnatural movement.” This could be evidence that the video and audio are out of sync, or that time-based corrections were made to parts of the video.
Make sure everything matches
“If it’s a head and shoulders shot, look at their head and body and what’s behind them,” he says. “Does it match, or is there a strange relationship between them?” Additionally: Does the lighting seem off? Does the person or some aspect of the scene appear “pasted on?” It could be manipulated.
Listen for clues
The Pelosi video “was played back at a slightly lower frame rate,” Delp explains. “The problem is, the audio track would also slow down.” So not only was her speech slow – the other sounds in the video were, too. That’s a giveaway that something’s off.
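The arithmetic behind that giveaway can be sketched in a few lines of Python. This is a rough illustration, not Delp's detection method: it assumes the audio is simply stretched along with the video and that no pitch correction is applied. The function name and the 0.75 speed factor are made up for the example.

```python
import math


def slowdown_effects(original_seconds: float, speed_factor: float) -> dict:
    """Estimate what naive slowed playback does to a clip's audio.

    speed_factor is the playback speed relative to the original
    (0.75 means 75% of normal speed). Assumes the audio is stretched
    together with the video and is not pitch-corrected.
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return {
        # The clip takes longer to play, so speech and background sound drag.
        "new_duration_seconds": original_seconds / speed_factor,
        # Every frequency is scaled by speed_factor; negative means lower pitch.
        "pitch_shift_semitones": 12 * math.log2(speed_factor),
    }


# Illustrative values only: a 60-second clip played at 75% speed.
effects = slowdown_effects(original_seconds=60.0, speed_factor=0.75)
print(effects)
```

Under those assumptions, everything on the soundtrack gets longer and deeper together, which is why background sounds can expose a slowdown as easily as the speech does.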
Check the metadata
This is a more sophisticated strategy that Delp’s team uses, and it could be included in detection software employed in the future by, say, media outlets. “This embedded data tells you more about the image or video, like when it was taken and what format it’s in,” Delp says. That data, a black box of sorts, could offer clues to any manipulation.
Test your knowledge
Click on the video that you believe has been manipulated. (Note: This excerpt does not have audio.)
Credit: Stanford University/Michael Zollhöfer
Researchers at the University of Washington took the audio from a speech by former President Barack Obama and then used video from a separate speech Obama gave to generate a realistic, lip-synced video of Obama giving the first speech. In other words, they changed the way his mouth moved. Watch how his jaw, chin and the lower half of his mouth move unnaturally in the manipulated version.
If you have a smartphone or have ever chatted with a virtual assistant on a call, you’ve probably already interacted with manipulated or synthetic voices. But like fake video, fake audio has gotten very sophisticated via artificial intelligence – and it can be just as damaging.
Vijay Balasubramaniyan is the CEO and co-founder of Pindrop, a company that creates security solutions to protect against the damage fake audio can do. He says manipulated audio is the basis for a lot of scams that can ruin people’s lives and even compromise large companies. “Every year, we see about $470 million in fraud losses, including from wire transfer and phone scams. It’s a massive scale,” he says.
Vijay Balasubramaniyan, CEO and Co-founder, Pindrop Security
While some of these rely on basic tricks similar to cheapfake videos – manipulating pitch to sound like a different gender, or inserting suggestive background noises – Balasubramaniyan says running a few hours of someone’s voice through AI software can give you enough data to manipulate the voice into saying anything you want. And the audio can be so realistic, it’s difficult for the human ear to tell the difference.
However, it’s not impossible. When you’re listening for manipulated audio, here’s what to take note of:
Listen for a whine
“If you don’t have enough audio to fill out all of the different sounds of someone’s voice, the result tends to sound more whiny than humans are,” Balasubramaniyan says. The reason, he explains, is that AI programs find it hard to differentiate between general noise and speech in a recording. “The machine doesn’t know any different, so all of that noise is packaged in as part of the voice.”
Note the timing
“When you record audio, every second of audio you analyze gives between 8,000 to 40,000 data points for your voice,” Balasubramaniyan says. “But what some algorithms are going to do is just make a created voice sound similar, not necessarily follow the human model of speech production. So if the voice says ‘Hello Paul,’ you may notice the speed at which it went from ‘Hello’ to ‘Paul’ was too quick.”
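To make the timing clue concrete, here is a minimal Python sketch that measures the quiet gap between two words. It reads the "data points" in the quote as audio samples, which is an assumption, and it is not Pindrop's method. The function name, the loudness threshold and the toy "Hello Paul" signal are all invented for illustration.

```python
def silent_gaps_ms(samples, sample_rate, threshold=0.02, min_gap_ms=20.0):
    """Return the length, in milliseconds, of each quiet stretch that sits
    between two louder stretches of audio.

    samples: amplitudes between -1.0 and 1.0
    sample_rate: samples ("data points") per second
    threshold, min_gap_ms: assumed tuning values, not established standards
    """
    gaps = []
    run = 0              # length of the current quiet run, in samples
    seen_sound = False   # ignore silence before the first loud sample
    for value in samples:
        if abs(value) < threshold:
            run += 1
            continue
        if seen_sound and run > 0:
            gap_ms = 1000.0 * run / sample_rate
            # Skip the tiny dips that happen inside ordinary speech.
            if gap_ms >= min_gap_ms:
                gaps.append(gap_ms)
        run = 0
        seen_sound = True
    return gaps


SAMPLE_RATE = 8000  # low end of the range quoted above

# Toy signal: 0.3 s of "Hello", 50 ms of silence, 0.3 s of "Paul".
word = [0.5, -0.5] * 1200
pause = [0.0] * 400
gaps = silent_gaps_ms(word + pause + word, SAMPLE_RATE)
print(gaps)
```

A real recording would need a more careful loudness measure than a per-sample threshold, but the idea is the same: with thousands of samples per second, even a pause that is a few hundredths of a second too short is measurable.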
Pay attention to unvoiced consonants
Make a “t” sound with your mouth, like you’re starting to say the word “tell.” Now make an “m” sound, like you are about to say “mom.” Do you notice the difference? Some consonant sounds, like t, f and s, can be made without using your voice. These are called unvoiced consonants or, in the world of audio forensics, fricatives. “When you say these fricatives, that kind of sound is very similar to noise,” Balasubramaniyan says. “They have different characteristics than other parts of vocal speech, and machines aren’t very good at replicating them.”
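One textbook way to see why these sounds resemble noise is the zero-crossing rate: how often the waveform flips between positive and negative. The Python sketch below is a toy comparison under stated assumptions, not the technique Balasubramaniyan's company uses. A steady tone stands in for a voiced sound and random noise stands in for a hiss-like consonant; both signals and all names are illustrative.

```python
import math
import random


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose sign differs."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


SAMPLE_RATE = 8000      # samples per second (assumed)
rng = random.Random(0)  # fixed seed so the toy data is repeatable

# Stand-in for a voiced sound: one second of a steady 120 Hz tone.
tone = [math.sin(2 * math.pi * 120 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
# Stand-in for an unvoiced, hiss-like sound: one second of random noise.
hiss = [rng.uniform(-1.0, 1.0) for _ in range(SAMPLE_RATE)]

tone_zcr = zero_crossing_rate(tone)
hiss_zcr = zero_crossing_rate(hiss)
print(f"tone: {tone_zcr:.3f}  hiss: {hiss_zcr:.3f}")
```

The noise flips sign far more often than the tone, which is the kind of difference in "characteristics" the quote describes.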
What it sounds like
Listen to this manipulated audio. Note the speed of the words and the placement of the consonants, and how they sound different from natural speech.
Artificial intelligence can go as far as creating entire people out of thin air, using deep learning technology like that seen in sophisticated audio and video fakes. Essentially, the program is fed thousands and thousands of versions of something – in this case, human faces – and it “learns” how to reproduce it. StyleGAN is one such program, and it’s the artificial brain behind ThisPersonDoesNotExist, a website launched by software engineer Phillip Wang that randomly generates fake faces.
This technology can easily be used as the basis for fake online profiles and personas, which can be built into entire fake networks and companies for the purposes of large-scale deception.
“I think those who are unaware of the technology are most vulnerable,” Wang told CNN in 2019. “It’s kind of like phishing — if you don’t know about it, you may fall for it.”
Fake faces can actually be easier to spot than fake video or audio, if you know what you’re looking for. A viral Twitter thread from Ben Nimmo, a linguist and security expert, details some of the most obvious clues using the fake faces below.
Examine accessories like eyeglasses and jewelry
“Human faces tend [toward] asymmetry,” Nimmo writes. “Glasses, not so much.” Things like glasses and earrings may not match from one side of the face to the other, or may warp strangely into the face. Things like hats could blend into the hair and background.
Credit: thispersondoesnotexist.com/Nvidia
“Backgrounds are even harder, because there’s more variation in scenery than there is in faces,” Nimmo writes. Trees, buildings and even the edges of other “faces” (if the picture is a cropped group photo) can warp or repeat in deeply unnatural ways. The same goes for hair — it might be unclear where hair stops and the background begins, or the general structure of the hair could look amiss.
Credit: thispersondoesnotexist.com/Nvidia
Teeth and ears seem simple at a distance, but are highly irregular structures up close. Like the symmetry problem with glasses and jewelry, an AI program may have problems predicting the number and shape of teeth or the irregular whorls of an ear. Teeth may appear to duplicate, overlap, or fade into the sides of the mouth. The inside of an ear may look blurry, or the ears may look extremely mismatched.
Credit: thispersondoesnotexist.com/Nvidia
Of course, tips for spotting manipulated media only work if something about the media – or the reaction to it – makes you suspicious in the first place. Developing the healthy skepticism and analytical power to sniff out these manipulations is not a job for your eyes or ears, but for your sense of judgment. Beefing up your media literacy skills can help you suss out when a piece of news seems suspect and help you take steps to confirm or discount it.
Theresa Giarrusso teaches media literacy to teachers, students and senior citizens across the country. She says there are different strengths needed to build media literacy.
Theresa Giarrusso, Media literacy educator and expert
“I’ve found that adults have the critical thinking skills and the history to spot misinformation, but they’re not digital natives. They don’t have the digital skills,” she says. “With teenagers, they have the digital and technical skills, but not the skepticism and critical thinking.”
Giarrusso outlines five different types of misinformation:
- Manipulated media: Photoshops, edited “cheapfakes” and some deepfakes.
- Fabricated media: Generated media, like fake faces, and some deepfakes.
- False context: When a photo, piece of video or even entire event is taken out of context and attached to a different narrative.
- Imposter media: When someone pretends to be a reputable news source, or impersonates a news source.
- Satire: Misinformation knowingly created for the purpose of entertainment or commentary.
When you come across a questionable piece of information, whether it’s being shared on Facebook by an outraged relative or spurring controversy among politicians on Twitter, Giarrusso has some tips on how to verify or reject it:
Check multiple sources
This is known as the lateral reading method. “This is how people should be researching, and how fact-checkers research,” Giarrusso says. “Open tabs and compare and contrast. Look into the source, and then the author. If it’s a publication you don’t know, research whether the site is reliable. Is the information being reported elsewhere, and if so, how?” she says. “It doesn’t help you to get information about a bad actor from a bad actor. You have to find out what other people are saying.”
Practice the SIFT method
This method, Giarrusso says, is from Mike Caulfield at Washington State University Vancouver. The steps are as follows:
STOP: “Don’t like, comment or share until you’ve investigated,” Giarrusso says. “Part of good disinformation is that it makes an emotional response. It’s trying to provoke an emotional reaction so you will engage with it.”
INVESTIGATE the source: This is where the lateral reading skill comes in.
FIND other coverage: “Are other people reporting this, is it presented in the same way? Do they have a different perspective?” Giarrusso also warns of circular reporting, when all the outlets reporting a story lead back to the same original source.
TRACE claims, quotes and media back to the original source: “This is the biggest step for deepfakes and cheapfakes,” she says. “We’re not video editors or photo editors. But if you can find the original version, you can see if there are alterations.” For instance, in the case of the Pelosi video from last year that was slowed down, “if you went back to other videos from that event, it would be quite evident,” she says.
What makes living in a world of fake and manipulated media even more confusing is that such creations aren’t always used for evil. Deepfake videos can create once-in-a-lifetime experiences for consumers and, in the case of recent ads by the TV service Hulu, allow celebrities to put their face and voice on a project without actually being there at all. AI-generated audio can change the lives of people who can’t speak.
These technologies are growing rapidly in all directions, so our methods of detecting them and protecting ourselves have to as well.
“It’s an arms race,” says Balasubramaniyan. He worries about what may happen if someone with this sophisticated technology goes after a major world leader, or ends up inventing an entire event out of thin air.
“We’re going to have to keep developing technology and machines to stay ahead of that.”
So for now, the average person may not have sophisticated algorithms or years of expertise in spotting fake media. But their five senses – and a healthy amount of skepticism – can be a first line of defense.