Vulnerability assessment and detection of Deepfake videos

Database

The database of Deepfake videos that is used in the experiments is DeepfakeTIMIT.

It is important to note that we do not provide the codebase we used to create the DeepfakeTIMIT database. We started from an open-source GitHub implementation but heavily modified it to streamline batch training of Deepfake models and batch video generation. While we released the resulting video data to facilitate the creation of Deepfake detectors, we would rather not make it even easier to generate such videos.

Usage instructions

This package provides the source code for reproducing the experimental results presented in the above papers. The first part of the experiments demonstrates the vulnerability of state-of-the-art face recognition systems to Deepfake videos, i.e., the systems are fooled into accepting a Deepfake video as if it contained the swapped-in person rather than the original speaker. The second part of the experiments evaluates Deepfake detectors, the best of which is based on image quality features and an SVM classifier.

Before running the experiments, please set the path to the DeepfakeTIMIT database inside the file path_to_data.txt, like so:

  • [SAVI_DATA_DIRECTORY]=/path/to/the/folder/where/DeepfakeTIMIT/is

Please note that the original VidTIMIT database is also required to run the experiments, and it is assumed to reside in the same folder as DeepfakeTIMIT. Hence, path_to_data.txt should point to a folder containing two subfolders: DeepfakeTIMIT and VidTIMIT. Also note that the video data in VidTIMIT is stored as collections of images, which must be converted to AVI files before proceeding; see the conversion sketch below.
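Any conversion tool can be used for this step. As one possible sketch, assuming the frames of each VidTIMIT sequence are stored as sequentially numbered JPEG images (e.g., 001.jpg, 002.jpg, ...; the actual naming in your copy of the database may differ), an ffmpeg one-liner per sequence could look like this:

$ ffmpeg -framerate 25 -i /path/to/VidTIMIT/subject/video/sequence/%03d.jpg \
  -c:v mjpeg -q:v 2 /path/to/VidTIMIT/subject/video/sequence.avi

Adjust the input pattern, frame rate, and codec to match the actual frame naming and the video format expected by the experiments.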

Vulnerability of face recognition

You can modify the parameters of the vulnerability experiments in the following config files inside the bob.report.deepfakes.config folder: vuln_alg_cosine.py, vuln_db_vdtfaceswap.py, vuln_extr_facenet.py, and vuln_extr_vgg.py. Run the experiments using the FaceNet-based face recognition system as follows:

$ bin/verify.py bob/report/deepfakes/config/vuln_db_vdtfaceswap.py bob/report/deepfakes/config/vuln_extr_facenet.py \
bob/report/deepfakes/config/vuln_alg_cosine.py --protocol oneset-spoof
$ bin/verify.py bob/report/deepfakes/config/vuln_db_vdtfaceswap.py bob/report/deepfakes/config/vuln_extr_facenet.py \
bob/report/deepfakes/config/vuln_alg_cosine.py --protocol oneset-licit

The scores for the licit (normal face recognition without Deepfakes) and spoof (Deepfake videos used as probes) scenarios will be located inside the ./results/facenet-cosine folder. Measure the vulnerability of the system by running the following:

$ bob bio metrics -e ./results/facenet-cosine/oneset-licit/nonorm/scores-dev \
./results/facenet-cosine/oneset-spoof/nonorm/scores-dev

This command takes the scores from the licit scenario as development scores and determines from them the threshold that separates genuine scores from zero-effort impostors. It then applies this threshold to the spoof-scenario scores, where the two sets are genuine scores and scores from Deepfakes. The Half Total Error Rate (equal to the Equal Error Rate in this case) should be 0.0% for the Development column (the licit scenario) and 45.3% (with a FAR of 90.6%) for the Evaluation column.

To run the experiments with the VGG-based face recognition system, use the vuln_extr_vgg.py config instead of vuln_extr_facenet.py when running the bin/verify.py script.
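For example, the spoof-protocol run with the VGG-based extractor looks like this (the licit protocol is analogous):

$ bin/verify.py bob/report/deepfakes/config/vuln_db_vdtfaceswap.py bob/report/deepfakes/config/vuln_extr_vgg.py \
bob/report/deepfakes/config/vuln_alg_cosine.py --protocol oneset-spoof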

To run the experiments for lower quality Deepfakes, replace the line db_name='deepfaketimit_hq' with db_name='deepfaketimit_lq' inside vuln_db_vdtfaceswap.py.
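If you prefer to apply this change from the command line, a sed one-liner along these lines should work (GNU sed shown; on macOS use sed -i ''; editing the file manually in a text editor is equally fine):

$ sed -i "s/deepfaketimit_hq/deepfaketimit_lq/" bob/report/deepfakes/config/vuln_db_vdtfaceswap.py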

Detection with image quality features

You can modify the parameters of the Deepfake detection experiments in the following config files inside the bob.report.deepfakes.config folder: pad_db_vdtfaceswap.py, pad_extr_iqm_alg_pca_lda.py, pad_extr_iqm_alg_svm.py, and pad_extr_lin_alg_pca_lda.py. To run the IQM-SVM-based Deepfake detector, run the following:

$ bin/spoof.py bob/report/deepfakes/config/pad_db_vdtfaceswap.py bob/report/deepfakes/config/pad_extr_iqm_alg_svm.py

This script will produce the scores and place them inside the ./results/pad-iqm-svm folder. To measure the Deepfake detection accuracy, run the following command:

$ bob pad metrics -e ./results/pad-iqm-svm/train_dev/scores/scores-train ./results/pad-iqm-svm/train_dev/scores/scores-dev

This command takes the scores computed for the training set (the corresponding column is labeled Development because this database has only two sets), determines a threshold from them, and uses that threshold to compute the metrics for the development set (reported in the Evaluation column of bob pad metrics). The resulting HTER should be 0% for the first set (Training/Development scores) and 12.4% for the Development/Evaluation scores.
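The other detector configs listed above should, presumably, be run in the same way; for example, an IQM-based detector with a PCA-LDA classifier would be launched as follows (this invocation simply mirrors the IQM-SVM command, and the output folder name will differ accordingly):

$ bin/spoof.py bob/report/deepfakes/config/pad_db_vdtfaceswap.py bob/report/deepfakes/config/pad_extr_iqm_alg_pca_lda.py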