Vanilla PAD features

Most of the available features are equivalent to the ones defined in Vanilla Biometrics: Advanced features. However, there are some variations, and those are presented below.

Database interface

The database interface definition follows closely the one in The database interface. However, the naming of the methods to retrieve data is different:

  • database.fit_samples() returns the samples (or delayed samples) used to train the classifier;

  • database.predict_samples() returns the samples that will be used for evaluating the system. This is where the group (dev or eval) is specified.

A difference with the bob.bio.base database interface is the presence of an attack_type annotation. It stores the type of PAI to allow the scoring of each different type of attack separately.

File list interface

A class with those methods returning the corresponding data can be implemented for each dataset, but an easier way to do it is with the file list interface. This allows the creation of multiple protocols and various groups by editing some CSV files.

The dataset configuration file will then be as simple as:

from bob.pad.base.database import CSVPADDataset

database = CSVPADDataset("path/to/my_dataset", "my_protocol")

And the command to run an experiment with that configuration on the svm-frames pipeline:

$ bob pad vanilla-pad -d my_db_config_file.py svm-frames -o output_dir

The files must follow the following structure and naming:

my_dataset
|
+-- my_protocol
    |
    +-- train
    |   |
    |   +-- for_real.csv
    |   +-- for_attack.csv
    |
    +-- dev
    |   |
    |   +-- for_real.csv
    |   +-- for_attack.csv
    |
    +-- eval
        |
        +-- for_real.csv
        +-- for_attack.csv

The content of the files in the train folder is used when a protocol contains data for training the classifier. The files in the eval folder are optional and are used in case a protocol contains data for evaluation.

These CSV files should contain at least the path to raw data and an identifier to the identity of the subject in the image (subject field). The structure of each CSV file should be as below:

PATH,SUBJECT
path_1,subject_1
path_2,subject_2
path_i,subject_j
...

Metadata can be shipped within the Samples (e.g gender, age, annotations, …) by adding a column in the CSV file for each metadata:

PATH,SUBJECT,TYPE_OF_ATTACK,GENDER,AGE
path_1,subject_1,A,B,C
path_2,subject_2,A,B,1
path_i,subject_j,2,3,4
...

Checkpoints and Dask

In the same way as in bob.bio.base, it is possible to activate the checkpointing of experiments by passing the -c (--checkpoint) option in the command line.

The Dask integration can also be used by giving a client configuration to the -l (--dask-client) argument. Basic Idiap SGE configurations are defined by bob.pipelines: sge and sge-gpu:

$ bob pad vanilla-pad replay-attack svm-frames -o output_dir -l sge -c