.. vim: set fileencoding=utf-8 :
.. Pavel Korshunov
.. Thu 12 Jul 13:43:22 2018

.. _bob.paper.eusipco2018:

=====================================================================================
Documentation for EUSIPCO paper "Speaker Inconsistency Detection in Tampered Video"
=====================================================================================

Creating databases
------------------

So far, the algorithms in this package can be run on two sets of data, generated
from the AMI_ and VidTIMIT_ databases. Both databases need to be downloaded first.
We used the original databases to create corresponding sets of genuine and tampered
videos, as well as evaluation protocols.

VidTIMIT_ database
~~~~~~~~~~~~~~~~~~

From the images in VidTIMIT_, we generate video files for the genuine subset and we
down-sample the audio files to 16 bits. We also generate our own tampered videos
(so far, for each video, we replace the speech with that of 5 other random
speakers). Here are the steps to generate the datasets:

* Provided you have the VidTIMIT_ database downloaded to `/path/to/vidtimit`,
  generate the genuine audio and video:

  .. code:: sh

     $ bin/python bob/paper/eusipco2018/scripts_vidtimit/generate_non-tampered-audio.py -d /path/to/vidtimit/audio -o /output/dir/where/genuine/files/will/be
     $ bin/python bob/paper/eusipco2018/scripts_vidtimit/generate_non-tampered-video.py -d /path/to/vidtimit/video -o /output/dir/where/genuine/files/will/be

* Generate tampered videos (5 tampered for each genuine):

  .. code:: sh

     $ bin/python bob/paper/eusipco2018/scripts_vidtimit/generate_tampered.py -d /dir/where/genuine/files/are/ -o /dir/where/tampered/files/will/be/ -t 5

  For each genuine video, this script randomly takes audio from 5 other people and
  creates an audio file with the same name, thus producing 5 audio-video pairs
  where the lip movements do not match the speech (a quick sanity check for this
  is sketched at the end of this section).

* Run face and landmark detection to preprocess the videos (this step is specific
  to the SGE grid at Idiap_):

  .. code:: sh

     $ cd bob/paper/eusipco2018/job
     $ bash submit_cpm_detection.sh $(find /dir/where/genuine/files/are -name '*.avi')
     $ bash watch_jobs.sh /dir/where/genuine/files/are

* Move the resulting detections into the genuine and tampered directories:

  .. code:: sh

     $ rsync --chmod=0777 -avm --include='*.hdf5' -f 'hide,! */' /dir/where/genuine/files/are/ /dir/where/genuine/files/are/
     $ bin/python bob/paper/eusipco2018/scripts_vidtimit/reallocate_annotation_files.py -a /dir/where/genuine/files/are -o /dir/where/tampered/files/are

AMI_ database
~~~~~~~~~~~~~

Since AMI_ contains many different types of videos that are not well suited for
lip-sync detection, we need to extract a suitable set of videos (a single person
in the video frame, speaking). Using the annotation files provided in the
`bob/paper/eusipco2018/data/ami_annotations/` folder, we cut 15-40 second videos
from the single-speaker shots and use the audio recorded with the lapel
microphone. To generate training and development data from AMI_, follow these
steps:

* Provided you have the AMI_ database downloaded to `/path/to/ami`, you can
  generate genuine videos by running the following script:

  .. code:: sh

     $ bin/python bob/paper/eusipco2018/scripts_amicorpus/generate_non-tampered.py -d /path/to/ami -a bob/paper/eusipco2018/data/ami_annotations/p1.trn.mdtm -o /output/dir/where/genuine/files/will/be

* Generate the tampered video set (5 tampered for each genuine) by running the
  following:

  .. code:: sh

     $ bin/python bob/paper/eusipco2018/scripts_amicorpus/generate_tampered.py -d /path/to/ami/genuine/videos -o /output/dir/where/tampered/files/will/be -t 5

  For each genuine video, this script randomly takes audio from 5 other people and
  merges it with the video, thus creating 5 tampered videos where the lip
  movements do not match the speech.

* Split the video and audio into separate files (run for both the genuine and the
  tampered directories):

  .. code:: sh

     $ bin/python bob/paper/eusipco2018/scripts_amicorpus/bin/extract_audio_from_video.py -d /path/to/ami/videos -o /path/to/ami/videos -p /path/to/ami/videos/

* The rest of the processing is the same as for VidTIMIT_.
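Before moving on to feature extraction, it may be worth checking that the
tampering step completed: with `-t 5`, the tampered set should contain five files
for every genuine video. The snippet below is only a minimal sketch for the
VidTIMIT set, not part of the original scripts; it assumes (as in the detection
step above) that genuine videos are `*.avi` files and that the tampered audio is
stored as `*.wav` — adjust the paths and extensions to your layout:

.. code:: sh

   $ # Hypothetical sanity check: count genuine videos and tampered audio files.
   $ n_genuine=$(find /dir/where/genuine/files/are -name '*.avi' | wc -l)
   $ n_tampered=$(find /dir/where/tampered/files/are -name '*.wav' | wc -l)
   $ # With -t 5, the tampered count should be five times the genuine count.
   $ echo "genuine: ${n_genuine}, tampered: ${n_tampered} (expected: $((5 * n_genuine)))"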
Step-by-step instructions for reproducing the experiments
----------------------------------------------------------

For face and landmark detection, please refer to this README_ (note that although
most of the steps can be replicated on a local machine, the README is written with
the SGE grid and the supporting infrastructure available at Idiap_ in mind).

Before training models, video and audio features need to be preprocessed and
extracted. First, preprocess the video:

.. code:: sh

   $ bin/train_gmm.py bob/paper/eusipco2018/config/video_extraction_pipeline.py -P oneset-licit -s mfcc20mouthdeltas
   $ bin/train_gmm.py bob/paper/eusipco2018/config/video_extraction_pipeline.py -P oneset-spoof -s mfcc20mouthdeltas

Then, use the audio pipeline to extract audio features (the video features should
be ready by then) and train the models (here, we use GMMs as an example of the
classifiers):

.. code:: sh

   $ bin/train_gmm.py bob/paper/eusipco2018/config/audio_extraction_pipeline.py -P oneset-licit -s mfcc20mouthdeltas --projector-file Projector_gmm_mfcc20_mouthdeltas_licit.hdf5
   $ bin/train_gmm.py bob/paper/eusipco2018/config/audio_extraction_pipeline.py -P oneset-spoof -s mfcc20mouthdeltas --projector-file Projector_gmm_mfcc20_mouthdeltas_spoof.hdf5

Test the models and compute scores:

.. code:: sh

   $ bin/spoof.py bob/paper/eusipco2018/config/audio_extraction_pipeline.py -P train_dev -a gmm -s mfcc20mouthdeltas --projector-file Projector_gmm_mfcc20_mouthdeltas_spoof.hdf5

A combined script chaining these commands is sketched at the end of this page.

===========
Users Guide
===========

.. toctree::
   :maxdepth: 2

   guide

Contact
-------

For questions or to report issues with this software package, contact Pavel
Korshunov (pavel.korshunov@idiap.ch).

.. Place your references here:
.. _bob: https://www.idiap.ch/software/bob
.. _installation: https://www.idiap.ch/software/bob/install
.. _mailing list: https://www.idiap.ch/software/bob/discuss
.. _algorithm: https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation
.. _dlib: http://dlib.net/
.. _AMI: http://groups.inf.ed.ac.uk/ami/download/
.. _README: https://gitlab.mediforprogram.com/savi/lip.sync/tree/master/bob/paper/eusipco2018/job
.. _Idiap: https://www.idiap.ch
.. _VidTIMIT: http://conradsanderson.id.au/vidtimit
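Putting it together
-------------------

For convenience, the feature-extraction, training, and scoring commands from the
step-by-step instructions can be chained into a single script. The sketch below
simply repeats those commands verbatim; the configuration paths, protocol names,
`-s` label, and projector file names are the ones documented above and may need
adjusting to your setup:

.. code:: sh

   #!/usr/bin/env bash
   # Convenience wrapper around the commands documented above
   # (a sketch, not part of the original package).
   set -e  # stop at the first failing step

   CFG=bob/paper/eusipco2018/config
   SUB=mfcc20mouthdeltas

   # 1. Preprocess video and extract video features.
   bin/train_gmm.py $CFG/video_extraction_pipeline.py -P oneset-licit -s $SUB
   bin/train_gmm.py $CFG/video_extraction_pipeline.py -P oneset-spoof -s $SUB

   # 2. Extract audio features and train the GMM models.
   bin/train_gmm.py $CFG/audio_extraction_pipeline.py -P oneset-licit -s $SUB \
       --projector-file Projector_gmm_mfcc20_mouthdeltas_licit.hdf5
   bin/train_gmm.py $CFG/audio_extraction_pipeline.py -P oneset-spoof -s $SUB \
       --projector-file Projector_gmm_mfcc20_mouthdeltas_spoof.hdf5

   # 3. Test the models and compute the scores.
   bin/spoof.py $CFG/audio_extraction_pipeline.py -P train_dev -a gmm -s $SUB \
       --projector-file Projector_gmm_mfcc20_mouthdeltas_spoof.hdf5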