Replay-Attack

The Replay-Attack Database for face spoofing consists of 1300 video clips of photo and video attack attempts on 50 clients, under different lighting conditions. The database was produced at the Idiap Research Institute, in Switzerland.

Spoofing Attacks Description

The 2D face spoofing attack database consists of 1,300 video clips of real accesses and of photo and video attack attempts on 50 clients, recorded under different lighting conditions.

The data is split into 4 sub-groups comprising:

* Training data ("train"), to be used for training your anti-spoof classifier;
* Development data ("devel"), to be used for threshold estimation;
* Test data ("test"), with which to report error figures;
* Enrollment data ("enroll"), that can be used to verify spoofing sensitivity on face detection algorithms.

Clients that appear in one of the data sets (train, devel or test) do not appear in any other set.

Database Description

All videos are generated either by having a (real) client try to access a laptop through a built-in webcam or by displaying a photo or a video recording of the same client for at least 9 seconds. The webcam produces colour videos with a resolution of 320 pixels (width) by 240 pixels (height). The movies were recorded on a MacBook laptop using the QuickTime framework (codec: Motion JPEG) and saved into ".mov" files. The frame rate is about 25 Hz. Besides the native support on Apple computers, these files are *easily* readable using mplayer, ffmpeg or any other video utility available under Linux or MS Windows systems.

Real client accesses as well as data collected for the attacks are taken under two different lighting conditions:

* **controlled**: the office lights are on, the blinds are down, and the background is homogeneous;
* **adverse**: the blinds are up, the background is more complex, and the office lights are off.

To produce the attacks, high-resolution photos and videos of each client were taken under the same conditions as in their authentication sessions, using a Canon PowerShot SX150 IS camera, which records both 12.1 Mpixel photographs and 720p high-definition video clips. The attacks fall into two subsets according to how the attack medium is supported. The first subset is composed of videos generated using a stand to hold the client biometry ("fixed"). In the second subset, the attacker holds the device used for the attack in their own hands. In total, 20 attack videos were recorded for each client, 10 for each of the attacking modes just described:

* 4 x mobile attacks using an iPhone 3GS screen (with resolution 480x320 pixels) displaying:
  * 1 x mobile photo/controlled
  * 1 x mobile photo/adverse
  * 1 x mobile video/controlled
  * 1 x mobile video/adverse

* 4 x high-resolution screen attacks using an iPad (first generation, with a screen resolution of 1024x768 pixels) displaying:
  * 1 x high-resolution photo/controlled
  * 1 x high-resolution photo/adverse
  * 1 x high-resolution video/controlled
  * 1 x high-resolution video/adverse

* 2 x hard-copy print attacks (produced on a Triumph-Adler DCC 2520 color laser printer) occupying the whole available printing surface on A4 paper for the following samples:
  * 1 x high-resolution print of photo/controlled
  * 1 x high-resolution print of photo/adverse

The 1300 real-access and attack videos were then divided as follows:

* **Training set**: contains 60 real-accesses and 300 attacks under different lighting conditions;

* **Development set**: contains 60 real-accesses and 300 attacks under different lighting conditions;

* **Test set**: contains 80 real-accesses and 400 attacks under different lighting conditions;

* **Enrollment set**: contains 100 real-accesses under different lighting conditions, to be used **exclusively** for studying the baseline performance of face recognition systems.
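The split sizes above can be cross-checked against the per-client figures given elsewhere in this description. The following is a minimal sketch, not part of the database tools: the client counts (15, 15 and 20) come from the licit protocol section, the 4 real accesses per client is inferred from 60 real accesses over 15 training clients, and all names are hypothetical.

```python
# Clients per subset (15 + 15 + 20 = 50), as stated in the licit protocol section.
CLIENTS = {"train": 15, "devel": 15, "test": 20}

REAL_PER_CLIENT = 4      # inferred: 60 real accesses / 15 training clients
ATTACKS_PER_CLIENT = 20  # 10 hand-held + 10 fixed-support attacks
ENROLL_PER_CLIENT = 2    # one enrollment video per lighting condition


def split_counts():
    """Return ({subset: (real_accesses, attacks)}, enrollment_total)."""
    counts = {
        name: (n * REAL_PER_CLIENT, n * ATTACKS_PER_CLIENT)
        for name, n in CLIENTS.items()
    }
    enroll = sum(CLIENTS.values()) * ENROLL_PER_CLIENT
    return counts, enroll


counts, enroll = split_counts()
total = sum(real + attack for real, attack in counts.values()) + enroll
print(counts)
print("enrollment:", enroll, "total:", total)
```

Under these assumptions the computed sizes reproduce the 60/300, 60/300 and 80/400 splits, the 100 enrollment videos, and the overall total of 1300.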

Face Locations

We also provide face locations automatically annotated by a cascade of classifiers based on a variant of Local Binary Patterns (LBP) referred to as the Modified Census Transform (MCT) [Face Detection with the Modified Census Transform, Froba, B. and Ernst, A., 2004, IEEE International Conference on Automatic Face and Gesture Recognition, pp. 91-96]. The automatic face localisation procedure works in more than 99% of the total number of frames acquired, so fewer than 1% of the frames across all videos have no annotated face. User algorithms must account for this fact.
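One simple way to cope with frames that lack an annotation is to reuse the most recent detection. The sketch below is only an illustration of that idea: the annotation file format is not described here, so the input is assumed to be a hypothetical dictionary mapping frame indices to `(x, y, width, height)` boxes, and `fill_missing_faces` is a made-up helper name.

```python
def fill_missing_faces(boxes, n_frames):
    """Carry the last known face box forward over frames with no detection.

    boxes: dict mapping frame index -> (x, y, width, height), assumed format.
    Returns a list of length n_frames; entries stay None until the first
    detection is seen, so callers can skip those frames.
    """
    filled = []
    last = None
    for index in range(n_frames):
        last = boxes.get(index, last)
        filled.append(last)
    return filled


# Toy annotations: frames 0, 3 and 5 have no detected face.
detections = {1: (100, 60, 80, 80), 2: (102, 61, 80, 80), 4: (105, 60, 78, 78)}
filled = fill_missing_faces(detections, 6)
usable = [i for i, box in enumerate(filled) if box is not None]
print(usable)
```

Dropping unannotated frames altogether is an equally valid choice; which one is preferable depends on whether the counter-measure needs a contiguous frame sequence.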

Protocol for Licit Biometric Transactions

It is possible to measure the performance of baseline face recognition systems on the 2D Face spoofing database and to evaluate how well the attacks pass such systems or, conversely, how robust the systems are to attacks. Here we describe how to use the data in the enrollment set to create a background model and client models, and how to perform scoring.

1. Universal Background Model (UBM): To generate the UBM, subselect the training-set client videos from the enrollment videos. There should be 2 per client, which means you get 30 videos, each with 375 frames to create the model;

2. Client models: To generate client models, use the enrollment data for clients in the development and test groups. Again, there should be 2 videos per client (one for each light condition). At the end of the enrollment procedure, the development set must have 1 model for each of the 15 clients available in that set. Similarly, the test set must have 1 model for each of its 20 clients;

3. For a simple baseline verification, generate scores **exhaustively** for all videos from the development and test **real-accesses** respectively, but **without** intermixing across development and test sets. The scores generated against matched client videos and models (within the subset, i.e. development or test) should be considered true client accesses, while all others should be considered impostors;

4. If you are looking for a single number to report on the performance, do the following: using exclusively the scores from the development set, tune your baseline face recognition system to the EER threshold of the development set, then use this threshold to find the HTER on the test set scores.
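The last step can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference evaluation code: higher scores are assumed to mean "more likely the claimed client", the EER threshold is approximated by the observed score where FAR and FRR are closest, and the function names and toy scores are hypothetical.

```python
def far_frr(impostor, genuine, threshold):
    """False acceptance and false rejection rates; scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def eer_threshold(impostor, genuine):
    """Return the observed score at which |FAR - FRR| is smallest."""
    def gap(threshold):
        far, frr = far_frr(impostor, genuine, threshold)
        return abs(far - frr)

    candidates = sorted(set(impostor) | set(genuine))
    return min(candidates, key=gap)


def hter(impostor, genuine, threshold):
    """Half total error rate: the mean of FAR and FRR at a fixed threshold."""
    far, frr = far_frr(impostor, genuine, threshold)
    return (far + frr) / 2.0


# Toy scores only; real scores come from the exhaustive scoring in step 3.
dev_impostor, dev_genuine = [0.1, 0.2, 0.3, 0.6], [0.4, 0.7, 0.8, 0.9]
test_impostor, test_genuine = [0.1, 0.2, 0.5, 0.65], [0.55, 0.7, 0.8, 0.9]

threshold = eer_threshold(dev_impostor, dev_genuine)  # tuned on development only
test_hter = hter(test_impostor, test_genuine, threshold)
print(threshold, test_hter)
```

The threshold is chosen from development scores alone and then frozen, so the test-set HTER is not biased by tuning on test data.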

Protocols for Spoofing Attacks

Attack protocols are used to evaluate the (binary classification) performance of counter-measures to spoof attacks. The database can be split into 6 different protocols according to the type of device used to generate the attack: print, mobile (phone), high-definition (tablet), photo, video or grand test (all types). Furthermore, each of these 6 groups can be subdivided according to whether the attack was performed with the attacker's bare hands or using a fixed support. This classification scheme makes up a total of 18 protocols that can be used for studying the performance of counter-measures to 2D face spoofing attacks. The table below details the number of video clips in each protocol.

Distribution of videos per attack-protocol in the 2D Face spoofing database.

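| Protocol | Hand-Attack train | Hand-Attack dev | Hand-Attack test | Fixed-Support train | Fixed-Support dev | Fixed-Support test | All Supports train | All Supports dev | All Supports test |
|---|---|---|---|---|---|---|---|---|---|
| Print | 30 | 30 | 40 | 30 | 30 | 40 | 60 | 60 | 80 |
| Mobile | 60 | 60 | 80 | 60 | 60 | 80 | 120 | 120 | 160 |
| Highdef | 60 | 60 | 80 | 60 | 60 | 80 | 120 | 120 | 160 |
| Digitalphoto | 60 | 60 | 80 | 60 | 60 | 80 | 120 | 120 | 160 |
| Photo | 90 | 90 | 120 | 90 | 90 | 120 | 180 | 180 | 240 |
| Video | 60 | 60 | 80 | 60 | 60 | 80 | 120 | 120 | 160 |
| Grandtest | 150 | 150 | 200 | 150 | 150 | 200 | 300 | 300 | 400 |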
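The counts in the table follow from the per-client attack list and the number of clients in each subset. The sketch below recomputes them. It assumes 15, 15 and 20 clients in train, dev and test, and it reads "Digitalphoto" as photos displayed on a screen (mobile or high-definition, excluding print), an interpretation inferred from the counts rather than stated in the text. All names are hypothetical.

```python
# The 10 attacks recorded per client for ONE support type: (device, medium, lighting).
ATTACKS = [
    (device, medium, light)
    for device, media in [
        ("mobile", ("photo", "video")),
        ("highdef", ("photo", "video")),
        ("print", ("photo",)),
    ]
    for medium in media
    for light in ("controlled", "adverse")
]

# Each protocol is a filter over (device, medium).
PROTOCOLS = {
    "print": lambda d, m: d == "print",
    "mobile": lambda d, m: d == "mobile",
    "highdef": lambda d, m: d == "highdef",
    "digitalphoto": lambda d, m: m == "photo" and d != "print",  # assumed meaning
    "photo": lambda d, m: m == "photo",
    "video": lambda d, m: m == "video",
    "grandtest": lambda d, m: True,
}

CLIENTS = {"train": 15, "dev": 15, "test": 20}


def protocol_counts(supports=1):
    """Videos per protocol and subset; supports=1 for hand or fixed, 2 for both."""
    return {
        name: {
            subset: supports * n * sum(keep(d, m) for d, m, _ in ATTACKS)
            for subset, n in CLIENTS.items()
        }
        for name, keep in PROTOCOLS.items()
    }


one_support = protocol_counts(supports=1)
all_supports = protocol_counts(supports=2)
print(one_support["photo"], all_supports["grandtest"])
```

With `supports=1` the result matches both the Hand-Attack and Fixed-Support columns, and with `supports=2` it matches the All Supports columns.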

Acknowledgements

If you use this database, please cite the following publication:

I. Chingovska, A. Anjos, S. Marcel, "On the Effectiveness of Local Binary Patterns in Face Anti-spoofing", IEEE BIOSIG, 2012.
https://ieeexplore.ieee.org/document/6313548
http://publications.idiap.ch/index.php/publications/show/2447