XCSMAD (eXtended Custom Silicone Mask Attack Dataset)

535 short video recordings of both bona fide and presentation attacks (PA) from 72 subjects. The attacks have been created from custom silicone masks.

Dataset Description

The eXtended Custom Silicone Mask Attack Dataset (XCSMAD) consists of presentation attacks constructed from 21 custom silicone masks corresponding to 17 subjects. The dataset has been created for experiments related to detection of mask attacks on face recognition systems.

The XCSMAD dataset consists of 240 bona fide and 295 presentation attack (PA) videos, each approximately 10 s long. The videos have been acquired in three channels: RGB, near infrared (NIR), and thermal (LWIR). Two cameras with different specifications (capture quality and resolution) have been used for the thermal channel recordings.

The description of recording instruments is as follows:

Imaging Channel     | Sensor                   | Resolution
RGB (VIS)           | Intel RealSense SR300    | 1920 x 1080
Near Infrared (NIR) | Intel RealSense SR300    | 640 x 480
Thermal (TLQ)       | Seek Thermal Compact PRO | 320 x 240
Thermal (THQ)       | Xenics Gobi-640-GigE     | 640 x 480

The XCSMAD dataset is a subset of the WMCA dataset collected at Idiap Research Institute. For details on the WMCA dataset, please refer to: https://www.idiap.ch/dataset/wmca

The complete preprocessed data for the aforementioned videos and bona fide images (used in the experiments on vulnerability assessment) is provided to facilitate reproducing the experiments from the reference publication, as well as conducting new experiments. The details of preprocessing can be found in the reference publication.

The implementation of all experiments described in the reference publication is available at https://gitlab.idiap.ch/bob/bob.paper.xcsmad_facepad

Experimental Protocols

The reference publication considers two experimental protocols: grandtest and cross-validation (cv). For frame-level evaluation, 50 frames from each video have been used in both protocols. For the grandtest protocol, videos were divided into train, dev, and eval groups. Each group consists of a unique subset of clients; that is, all videos of a given subject belong to a single group.
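The description above does not say how the 50 frames per video are chosen. As a minimal sketch, assuming evenly spaced sampling (an assumption, not necessarily the procedure of the reference publication), frame indices could be picked as follows. The function name and the 300-frame example (10 s at a hypothetical 30 fps) are illustrative only.

```python
def evenly_spaced_indices(num_frames, num_samples=50):
    """Pick up to `num_samples` frame indices spread evenly over a video.

    Assumption: evenly spaced sampling. The dataset description only
    states that 50 frames per video are used, not how they are selected.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    if num_frames <= num_samples:
        # Short video: use every available frame.
        return list(range(num_frames))
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


# Hypothetical example: a 10 s video at 30 fps has 300 frames.
indices = evenly_spaced_indices(300)
print(len(indices), indices[:3], indices[-1])
```

With 50 frames per video, the 535 videos yield the 26750 frames reported for the grandtest protocol.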

For cross-validation (cv) experiments, a 5-fold protocol has been devised. Videos from XCSMAD have been split into 5 partitions with non-overlapping clients. Using these five partitions, 5 test protocols (cv0, ..., cv4) have been created such that in each protocol, four of the partitions are used for training, and the remaining one is used for evaluation.
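The construction of the five cv protocols from client-disjoint partitions can be sketched as below. This is an illustration of the scheme described above, not the code from the reference implementation; the function name and the client identifiers are hypothetical.

```python
def build_cv_protocols(partitions):
    """Build leave-one-partition-out protocols named cv0, cv1, ...

    `partitions` is a list of lists of client identifiers. Each protocol
    trains on all partitions but one and evaluates on the held-out one.
    """
    # Clients must not be shared between partitions.
    seen = set()
    for part in partitions:
        overlap = seen.intersection(part)
        if overlap:
            raise ValueError(
                f"clients appear in more than one partition: {sorted(overlap)}"
            )
        seen.update(part)

    protocols = {}
    for i, held_out in enumerate(partitions):
        train = [c for j, part in enumerate(partitions) if j != i for c in part]
        protocols[f"cv{i}"] = {"train": train, "eval": list(held_out)}
    return protocols


# Hypothetical client identifiers, for illustration only.
example_partitions = [["c01", "c02"], ["c03"], ["c04", "c05"], ["c06"], ["c07"]]
cv = build_cv_protocols(example_partitions)
print(cv["cv1"])
```

Because the split is made over clients rather than videos, the number of videos per fold can vary, which matches the unequal fold sizes in the cv table.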

Details of both protocols are summarized below:

Details of grandtest protocol:

Partition       | #Videos | #Frames | Split Ratio (%) | Total Frames
train bona fide | 86      | 4300    | 47.52           | 9050
train PA        | 95      | 4750    | 52.48           |
dev bona fide   | 80      | 4000    | 41.03           | 9750
dev PA          | 115     | 5750    | 58.97           |
eval bona fide  | 74      | 3700    | 46.54           | 7950
eval PA         | 85      | 4250    | 53.46           |
Total           | 535     | 26750   |                 | 26750
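The frame counts and split ratios in the grandtest table follow from the video counts and the 50 frames used per video. The short sketch below recomputes them; the variable and function names are illustrative, and the video counts are copied from the table.

```python
FRAMES_PER_VIDEO = 50  # stated for both protocols

# Video counts per group and class, copied from the grandtest table.
grandtest_videos = {
    "train": {"bona fide": 86, "PA": 95},
    "dev": {"bona fide": 80, "PA": 115},
    "eval": {"bona fide": 74, "PA": 85},
}


def summarize(videos_by_group, frames_per_video=FRAMES_PER_VIDEO):
    """Compute frame counts, per-group totals, and class ratios in percent."""
    summary = {}
    for group, counts in videos_by_group.items():
        frames = {label: n * frames_per_video for label, n in counts.items()}
        total = sum(frames.values())
        ratios = {label: 100.0 * f / total for label, f in frames.items()}
        summary[group] = {
            "frames": frames,
            "total_frames": total,
            "ratio_percent": ratios,
        }
    return summary


summary = summarize(grandtest_videos)
total_videos = sum(n for c in grandtest_videos.values() for n in c.values())
total_frames = sum(s["total_frames"] for s in summary.values())
print(total_videos, total_frames)
```

The recomputed ratios agree with the tabulated values to within 0.01 percentage points.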

Details of cv protocols:

Protocol | #train Videos [bona fide, PA] | #eval Videos [bona fide, PA]
cv0      | 409 [182, 227]                | 126 [58, 68]
cv1      | 410 [188, 222]                | 125 [52, 73]
cv2      | 433 [194, 239]                | 102 [46, 56]
cv3      | 454 [202, 252]                | 81 [38, 43]
cv4      | 434 [194, 240]                | 101 [46, 55]
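As a consistency check on the cv table, the train and eval counts of every protocol should add up to the 240 bona fide and 295 PA videos of the dataset, and the five evaluation folds together should cover all 535 videos. A small sketch of that check, with illustrative names and the counts copied from the table:

```python
TOTAL_BONA_FIDE = 240
TOTAL_PA = 295

# (bona fide, PA) video counts copied from the cv table.
cv_table = {
    "cv0": {"train": (182, 227), "eval": (58, 68)},
    "cv1": {"train": (188, 222), "eval": (52, 73)},
    "cv2": {"train": (194, 239), "eval": (46, 56)},
    "cv3": {"train": (202, 252), "eval": (38, 43)},
    "cv4": {"train": (194, 240), "eval": (46, 55)},
}


def check_cv_table(table, total_bf=TOTAL_BONA_FIDE, total_pa=TOTAL_PA):
    """Return a list of protocols whose counts do not sum to the totals."""
    problems = []
    for name, row in table.items():
        (train_bf, train_pa), (eval_bf, eval_pa) = row["train"], row["eval"]
        if train_bf + eval_bf != total_bf:
            problems.append(f"{name}: bona fide counts do not sum to {total_bf}")
        if train_pa + eval_pa != total_pa:
            problems.append(f"{name}: PA counts do not sum to {total_pa}")
    return problems


problems = check_cv_table(cv_table)
# Each video is evaluated in exactly one fold, so eval sizes sum to 535.
eval_total = sum(sum(row["eval"]) for row in cv_table.values())
print(problems, eval_total)
```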


If you use this dataset, please cite the following publication:

	author = {Kotwal, Ketan and Bhattacharjee, Sushil and Marcel, S\'{e}bastien},
	title = {Multispectral Deep Embeddings As a Countermeasure To Custom Silicone Mask Presentation Attacks},
	journal = {IEEE Transactions on Biometrics, Behavior, and Identity Science},
	publisher = {{IEEE}},
	year = {2019},