This article is taken from the second print issue of CYBR – ISSUE_02 S I M / U L A T I O N
Our last election cycle was plagued by the term ‘fake news’, which referred to a vast catalog of misinformation that spread primarily on Facebook and across much of the internet. It’s a phrase that is being bandied about by a reality TV star in charge of 300 million people, but misinformation spread by hyper-targeted Facebook campaigns is old technology. The real threat comes from a neural network algorithm that can synthesise anyone. If we thought we already had a fake news problem, it’s about to get a whole lot worse.
In 2017 the University of Washington released one of its latest projects, ‘Synthesizing Obama’, headed up by Ira Kemelmacher-Shlizerman. The team’s technology could be trained on simple YouTube videos of the subject and synthesise lip movements within a video to match any audio track supplied to it.
This technology can take a video clip of any speech, address, or interview and seamlessly make the subject say whatever the user wishes, with so much realism that the public may never know the difference. In short, it’s now possible to make anyone say anything you wish, as long as there is a video of them and an audio file of your choosing.
It goes without saying that the political and media implications are colossal. Presidents of countries can be synthesised saying words that never passed their lips. Telling genuine footage from misinformation and propaganda could become a serious challenge. Video authentication could suddenly become big business.
But this technology has positive use cases too. For those of you dreaming of a virtual world, or of holding a conversation with Steve Jobs, Prince or even a past president, this really could be the beginning of the answer.
Kemelmacher-Shlizerman states: “Realistic audio-to-video conversion has practical applications like improving video conferencing for meetings, as well as futuristic ones such as being able to hold a conversation with a historical figure in virtual reality by creating visuals just from audio. This is the kind of breakthrough that will help enable those next steps.”
Whilst the algorithm is currently designed to learn one individual at a time, the team want to improve it by adding the ability to recognise a person’s voice and speech patterns. This could cut the time the algorithm needs to learn from half a day to just one hour.
Halfway across the world in Garching, Germany, the Niessner Lab at the Technical University of Munich are experimenting with similar synthesis algorithms and with ways to detect forgeries. This year the lab released their real-time face mapping, a technology that maps an actor’s face on to a video clip as it plays. The result is that actors can manipulate, in real time, the facial expressions of everyday civilians, other actors, celebrities, and even presidents.
Named ‘HeadOn’, the technology enables the transfer of torso and head motion, facial expression, and even eye gaze. The result is a photorealistic simulation. What’s even more remarkable is that the hardware needed to achieve this is simply an iPad and a structure.io 3D sensor.
HeadOn is striking when seen in action: it is extremely difficult to tell real from fake. Across a number of TUM’s projects, it is fair to say that some of the synthesised results are indistinguishable from genuine footage.
Given that fact, the Niessner Lab have used their technology to turn the tables on video synthesis, recently releasing ‘FaceForensics’. This is a study into video forgeries that provides a dataset of over 500,000 frames and an accompanying algorithm to tackle fake imagery and video. The approach focuses on the facial area within a video and targets multiple discrepancies. It can also be trained on the output of systems like ‘HeadOn’ and ‘Face2Face’ to recognise forgeries automatically.
Whilst these technologies could be the early stages of an incredibly rich and versatile virtual reality environment, one that can realistically recreate person-to-person interaction and let us hold conversations with an endless choice of historical figures, they are also a Pandora’s box for misinformation and propaganda. New technologies will always be a ‘cat and mouse’ situation, and to tackle this comprehensively, we need to turn to education.
Awareness of these technologies is key, but critical thinking is now as important as ever. A Facebook news feed, YouTube video or broadcast TV station needs to be evaluated objectively by its viewer. There’s no doubt that many already apply critical thinking to Fox News, RT, and CCTV among others. We urgently need a change in mindset so that we apply the same scrutiny to all content and drop any sense of automatic trust.
It’s especially difficult when this content can be shared by friends and even family, automatically creating a false sense of security. But with the work of the University of Washington and the Niessner Lab certain to become more commonplace over the next few years, it’s incredibly important to be aware of just how much this could change our media landscape.