I have long said that deepfakes missed the boat on being stealthy, believable pieces of footage able to turn the tide of elections or other major events. We’ve seen time and again that the examples put forward during important events have been terrible, whereas the smart uses have tended to be quiet, low-level affairs that serve as a stepping stone rather than an end goal.

The other stance on this is that people will find moving imagery more believable than photographs or AI-generated text. This is an entirely fair point of view, and we can’t really say for sure that nothing seriously bad on a global scale will ever come from a deepfake. However, one study suggests the fakers still have a way to go before that becomes a possibility.

Audio and visual clues

The Register reports that researchers at MIT have pulled the rug out from under the whole deepfake issue. Just over 5,000 people participated in a showdown between video with audio, audio alone, and text transcripts. The task: figure out whether material featuring Joe Biden and Donald Trump, presented in each of those three formats, was real or fake. The results:

Video and audio: 82% guessed correctly

Audio only: 76% guessed correctly

Text only: 57% guessed correctly

One would assume text only is going to be quite the dice-throw. In terms of video, are we getting better at spotting the joins, the edits, the slight uncanny valley effect in most deepfake content? Have we somehow trained ourselves to know when something isn’t right by virtue of having seen so many pieces of deepfake content over the last few years?

These aren’t questions we have answers for yet, although as always, we may have more pressing concerns anyway.

Who needs deepfakes?

I’m a fan of the path of least resistance idea where deepfakes are concerned. That is to say, are there simpler ways to achieve the desired effect of a deepfake? And, if so, why even reach for the deepfake in the first place? It’s a lot of hard work for something with a big risk of little to no payoff.

A good case in point: the many, many pieces of dis/misinformation currently surrounding events in Ukraine. Deepfakes aren’t being used; it’s just regular footage spliced in whatever way is required. Tricky, sophisticated AI-generated fakes may not be required when simple photographs or videos are mislabeled and go viral.

In the last few days alone, we’ve seen several examples of this phenomenon doing big numbers on social media, quite often promoted by verified commentators who may be mistaken for authoritative sources.

Digitally doctored video versus jpegs

Commentators continue to warn of the dangers of deepfakes, but look at the example from this video. His head moves oddly and doesn’t look properly connected to the peculiarly flat-looking body. The mouth moves strangely at various points throughout. Even without using tools to analyse the footage, there’s clearly something very wrong with the content. So, we’re right back where we started: tried and tested methods, with lower technical overheads. When upwards of 14 million people are being told Steven Seagal is on the frontline thanks to a hasty image edit, the impact of overwrought deepfakes recedes into the distance.

When fakes fight fakes

On top of everything else, we have the peculiar sight of face-swap apps trying to disseminate real information. With a claimed 9 million messages sent out as part of one campaign, 2 million of them to users in Russia, it’s arguable that the most interesting and pervasive use of deepfake tech during a major event is to essentially cancel its own power.

Strange times indeed for the AI-altered revolution which never quite seems to land.