The DF-Mobio dataset is one of the largest databases available, with almost 15K deepfake videos. The original videos (31K in total) are taken from the Mobio database, which contains videos of a single person talking to the camera, recorded with a phone or a laptop. The scenario simulates participation in a virtual meeting over Zoom or Skype. The deepfakes in DF-Mobio were generated for 72 pairs of subjects that were manually selected from the original Mobio dataset.

We used a GAN model based on modified source code from https://github.com/shaoanlu/faceswap-GAN. The GAN was trained on face inputs of 256 × 256 pixels. For each pair of subjects, we generated videos with faces swapped from subject one to subject two and vice versa. The training images were extracted from laptop-recorded videos at 8 fps, resulting in more than 2K faces for each subject. Training ran for 40K iterations (about 24 hours on Tesla P80 GPU). Using the trained model, we then generated deepfakes from all laptop- and mobile-shot videos available for that pair of subjects in the original database.
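
The frame sampling and two-way swapping described above can be sketched roughly as follows. This is only an illustration, not the code used to build DF-Mobio: the function names `sampled_frame_indices` and `swap_directions` are hypothetical, the subject identifiers are placeholders, and the native frame rate of the Mobio videos is not stated here, so it is passed in as an assumed parameter.

```python
from typing import Iterable, Iterator, List, Tuple


def sampled_frame_indices(total_frames: int,
                          native_fps: float,
                          target_fps: float = 8.0) -> List[int]:
    """Return the frame indices to keep so that a video recorded at
    native_fps is subsampled to roughly target_fps (8 fps in the text).

    Hypothetical helper: native_fps is an assumption supplied by the caller.
    """
    if native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    step = native_fps / target_fps
    if step <= 1.0:
        # The video is already at or below the target rate: keep every frame.
        return list(range(total_frames))
    indices = []
    i = 0
    while True:
        idx = int(i * step)  # nearest earlier frame for the i-th sample
        if idx >= total_frames:
            break
        indices.append(idx)
        i += 1
    return indices


def swap_directions(pairs: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) for both swap directions of each subject pair."""
    for subject_one, subject_two in pairs:
        yield (subject_one, subject_two)
        yield (subject_two, subject_one)


if __name__ == "__main__":
    # One second of video at an assumed 30 fps, subsampled to 8 fps.
    print(sampled_frame_indices(total_frames=30, native_fps=30.0))
    # Placeholder subject identifiers, not real Mobio IDs.
    print(list(swap_directions([("subject_a", "subject_b")])))
```

Under this scheme, the 72 manually selected pairs correspond to 144 ordered swap directions.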



If you use this dataset, please cite the following paper:

Anubhav Jain, Pavel Korshunov, and Sebastien Marcel, "Improving Generalization of Deepfake Detection by Training for Attribution", International Workshop on Multimedia Signal Processing (MMSP), October 2021.