Just a few years ago, terms like neural network, artificial intelligence, and machine learning meant nothing to the average person and sounded like futurists' fantasies. Yet today they make incredible things possible, from photo processing on a cheap Chinese phone to highly accurate disease detection. Neural networks and related technologies are also at work in films, where they help bring deceased actors back to life or rejuvenate aging ones. This article looks at how favorite characters can be returned to the screen.

Manual Labor

Using computer graphics to resurrect deceased actors is far from a new practice. In 2013, Audrey Hepburn, who had died 20 years earlier, appeared in a commercial for Galaxy chocolate. A virtual copy of the actress was placed in a setting reminiscent of the Italian Amalfi coast from the movie Roman Holiday. Hepburn's children not only agreed to this but pointedly noted that Audrey would have been proud of the role, because she had always loved chocolate. One of the most famous examples, however, appeared in Star Wars: The Rise of Skywalker, released three years after the death of actress Carrie Fisher.

J.J. Abrams said that he did not want to cast a new actress as Princess Leia Organa, so the studio instead studied existing footage to generate the actress's likeness. The image was recreated almost entirely by hand: ready-made material from the early days of filming and fragments of past films were used to digitize a model of the actress and place her in similar scenes. "We started looking at what kind of frames they were, and we started writing scenes around these frames: completely new contexts, new places, new situations. The only thing I will say is that whenever you see Carrie, we have completely created, lit, and composed footage around the original material that we had," explained Abrams. A similar approach had been used on Furious 7, which began shooting in September 2013; two months later, actor Paul Walker died in a car accident. The filmmakers decided not to reshoot the scenes that had already been filmed and instead used them to recreate the actor's appearance, which was used again in the eighth installment of the franchise.

The creators of the Vietnam War action movie Finding Jack digitized the cult Hollywood actor James Dean, who died in 1955, for their film. With the advent of neural networks and machine learning, however, placing such tasks entirely on human shoulders has become too expensive and time-consuming. Deepfakes have become one of the best-known examples of using modern technology to generate someone else's appearance. The term deepfake (from deep learning and fake) was first widely discussed in 2017, when videos in which people's faces had been replaced with those of famous actors began appearing on Reddit. In a general sense, this technology, based on generative adversarial networks (GANs), makes it possible to manipulate audio and video so that a famous person in the frame appears to do something they have never actually done.

In the vast majority of cases, such videos are created with a neural network architecture called an autoencoder. It consists of two parts: the first learns to encode the original image, and the second learns to decode it so that the original face is replaced with the one you want to superimpose. The position of the eyes, nose, and mouth must remain the same as in the original picture. Sometimes this architecture is combined with a generative adversarial network (GAN). The mechanism is as follows: the generative part of the algorithm learns from real photographs of a certain person and creates an image, literally competing with the discriminative part, until the latter starts to mistake the generated fake for the original. "Thus, the encoder and decoder are responsible for transferring the face," explained Andrey Chertok of Sber AI.
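The shared-encoder, two-decoder layout described above can be sketched in a few dozen lines. This is a minimal illustrative sketch, not a real deepfake pipeline: it uses tiny linear layers and random 64-dimensional vectors as stand-ins for face images, and every name and number here is an assumption for demonstration. Real systems use deep convolutional networks trained on thousands of aligned face crops.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for face images of two people: 64-dim vectors clustered
# around different means. Purely illustrative data, not real images.
faces_A = 0.5 * rng.normal(size=(200, 64)) + 0.5
faces_B = 0.5 * rng.normal(size=(200, 64)) - 0.5

dim, latent, lr = 64, 8, 1e-3

# One shared encoder, one decoder per identity: the classic deepfake layout.
W_enc = 0.1 * rng.normal(size=(dim, latent))
W_dec_A = 0.1 * rng.normal(size=(latent, dim))
W_dec_B = 0.1 * rng.normal(size=(latent, dim))

def mse(x, y):
    return float(np.mean((x - y) ** 2))

def train_step(X, W_dec):
    """One gradient step on the reconstruction error for one identity."""
    global W_enc
    Z = X @ W_enc                # encode into the shared latent space
    X_hat = Z @ W_dec            # decode with this identity's decoder
    err = X_hat - X
    grad_dec = (Z.T @ err) / len(X)
    grad_enc = (X.T @ (err @ W_dec.T)) / len(X)
    W_dec -= lr * grad_dec       # in-place update of this decoder
    W_enc -= lr * grad_enc       # the encoder learns from both identities
    return mse(X_hat, X)

loss_before_A = mse(faces_A @ W_enc @ W_dec_A, faces_A)
loss_before_B = mse(faces_B @ W_enc @ W_dec_B, faces_B)
for _ in range(1000):
    loss_after_A = train_step(faces_A, W_dec_A)
    loss_after_B = train_step(faces_B, W_dec_B)

# The face swap: encode a "face" of A, decode it with B's decoder.
swapped = (faces_A[:1] @ W_enc) @ W_dec_B
print(f"A: {loss_before_A:.3f} -> {loss_after_A:.3f}, "
      f"B: {loss_before_B:.3f} -> {loss_after_B:.3f}")
```

Because the encoder is shared, it learns pose and expression common to both datasets, while each decoder learns one identity; decoding A's latent code with B's decoder is what produces the swap.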

The underlying GAN technique was developed in 2014 by Ian Goodfellow, then a doctoral student at the Université de Montréal, who later joined Google and then OpenAI, the company co-founded by Elon Musk. In 2017, a Reddit user with the nickname deepfakes began uploading porn videos in which famous actors appeared in the lead roles, spawning a wave of adult content featuring actresses such as Gal Gadot and Daisy Ridley. Reddit soon banned the publication of such material, and several US states began introducing legislative bans on publishing political or pornographic deepfakes. Even so, experts reported in 2019 that 96 percent of deepfakes were in some way related to adult videos.

Despite the dominance of pornography among deepfakes online, the technology has shown that it is possible to convincingly replace an actor in the frame, or even resurrect a deceased person, with home equipment. This gave rise to plenty of playful videos in which, say, every actor is replaced by Jim Carrey, as well as more curious experiments, such as fans mapping an actor's appearance onto their favorite hero of the Marvel Universe. The trend also reached politics: deepfakes were actively used during campaigning for the US presidential election. One of the first examples of the technology in serious work was a public service announcement about the dangers of malaria featuring David Beckham. Using neural networks, its creators translated the video into nine languages while preserving the footballer's voice and articulation.

Favorite Heroes

Until now, however, no one had tried to return a hero of the past to the screen using neural networks alone. For many, deepfakes are still just a way to make memes and funny clips, but the technology's potential is far from exhausted. A full range of techniques for recreating an actor's appearance, manner, and voice was shown by Sberbank in an advertisement featuring Georges Miloslavsky, the hero of the film "Ivan Vasilyevich Changes His Profession." It is the first example of such an integrated approach in a large production built on raw material from an old movie. In the ad, Miloslavsky finds himself in 2020 and learns that Sberbank is no longer just a bank but a technology giant. One of the most difficult tasks was recreating the voice of actor Leonid Kuravlev.

Compared with classical computer graphics techniques, neural networks turned out to be faster and cheaper. According to specialists at STC Group (the Speech Technology Center), producing a premium voice for enterprise clients usually requires dozens of hours of studio recordings of a speaker. Here, the material had to be collected from existing films, and after selecting the best fragments only seven minutes of speech had been accumulated. It was also a non-standard task with a short deadline.

First, we collected data to train TTS (text-to-speech, the speech-synthesis technology that converts written text into audible speech). The main source of material was audio fragments from films featuring Leonid Kuravlev. Although we tried to extract the cleanest recordings, some were still accompanied by extraneous sounds (city noise, sounds of nature, and so on) or music, which we then tried to remove. Any artifacts will inevitably seep into the neural model of the speaker's voice, especially with such a small training sample. In about half of the collected examples, we managed to separate the music from the voice completely, without significant loss of quality.
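The article does not describe how the cleanup was done, so here is a minimal sketch of one classical approach to removing a steady background, spectral subtraction, run on synthetic signals. Everything is an assumption for illustration: a pure tone stands in for speech, white noise for the background, and real pipelines use far more sophisticated source-separation models.

```python
import numpy as np

rng = np.random.default_rng(1)
sr, n, frame = 8000, 8192, 256          # sample rate, length, FFT frame size
t = np.arange(n) / sr

# Stand-ins for the real recordings: a 250 Hz tone as the "voice" and
# white noise as the "background". Purely illustrative signals.
voice = 0.6 * np.sin(2 * np.pi * 250 * t)
noise = 0.2 * rng.normal(size=n)
noisy = voice + noise

def denoise(signal, noise_sample, frame=256):
    """Spectral subtraction: remove an estimated noise floor frame by frame."""
    # Average magnitude spectrum of the noise, estimated from a voice-free
    # recording (here we cheat and use the true noise signal).
    chunks = noise_sample[: len(noise_sample) // frame * frame].reshape(-1, frame)
    noise_mag = np.abs(np.fft.rfft(chunks, axis=1)).mean(axis=0)
    out = np.zeros_like(signal)
    for start in range(0, len(signal) - frame + 1, frame):
        spec = np.fft.rfft(signal[start:start + frame])
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)   # subtract the floor
        # Keep the noisy phase, resynthesize with the reduced magnitude.
        out[start:start + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out

def snr_db(ref, x):
    """Signal-to-noise ratio of x against the clean reference, in dB."""
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((x - ref) ** 2))

clean = denoise(noisy, noise, frame)
print(f"SNR before: {snr_db(voice, noisy):.1f} dB, after: {snr_db(voice, clean):.1f} dB")
```

The same idea underlies many practical denoisers: estimate the background's spectrum from a voice-free stretch, then subtract it from each frame's magnitude spectrum while keeping the phase.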

In the end, we had exactly 4 minutes and 12 seconds of Leonid Vyacheslavovich Kuravlev's pure speech, and we worked on the emotions in parallel. The company added that the synthesized voice is indistinguishable from the real one only to an amateur ear: it is still synthesis. A system for detecting spoofing attacks can identify specific characteristics of the sound indicating that the voice is not live. Attention to such possible attacks is in many ways what distinguishes high-quality developers today. Anti-spoofing, protection against such attacks, is the subject of dedicated scientific competitions, including international ones, which the STC team has won more than once. STC believes that developers should not only create new technologies and products but also constantly search for new means of protection.