User’s Guide for PAD File List API

The Database Interface

The bob.pad.db.PadFileListDatabase complies with the standard PAD database interface described in Running Presentation Attack Detection Experiments. All functions defined in that interface become usable as soon as the user provides the required file lists.

Creating File Lists

The first step in using this package is to provide file lists specifying the 'train' (training), 'dev' (development) and 'eval' (evaluation) sets to be used by the PAD algorithm. The complete structure of the list base directory (denoted here as basedir) should look like this:

basedir -- train -- for_real.lst
       |       |-- for_attack.lst
       |
       |-- dev -- for_real.lst
       |      |-- for_attack.lst
       |
       |-- eval -- for_real.lst
               |-- for_attack.lst

The file lists should contain the following information for PAD experiments to run properly:

  • filename: The name of the data file, relative to the common root of all data files, and without file name extension.
  • client_id: The name or ID of the subject whose biometric traces are contained in the data file. These names are handled as str objects, so 001 is different from 1.
  • attack_type: The type of attack (str object). This field appears only in for_attack.lst files, not in for_real.lst files.

The following list files need to be created:

  • For each of the train, dev and eval sets:

    • real file, with default name for_real.lst, in the default sub-directories train, dev and eval, respectively. It is a 2-column file with format:

      filename client_id
      
    • attack file, with default name for_attack.lst, in the default sub-directories train, dev and eval, respectively. It is a 3-column file with format:

      filename client_id attack_type
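
The two formats above can be sketched with the standard library alone. The example below is a minimal illustration, not part of bob.pad.db: the helpers write_list and read_list, the sample file names and the attack types are all hypothetical, and it assumes that columns are separated by whitespace and that file names contain no spaces.

```python
import tempfile
from pathlib import Path

# Hypothetical sample entries; the file names and attack types are made up.
REAL_ROWS = [("real/client001_session01", "001"),
             ("real/client002_session01", "002")]
ATTACK_ROWS = [("attack/client001_print01", "001", "print"),
               ("attack/client002_replay01", "002", "replay")]


def write_list(path, rows):
    """Write one row per line, columns separated by a single space."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("".join(" ".join(row) + "\n" for row in rows))


def read_list(path, n_columns):
    """Read a list file back, keeping every column as a str."""
    rows = []
    for line in path.read_text().splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != n_columns:
            raise ValueError(f"{path}: expected {n_columns} columns in {line!r}")
        rows.append(tuple(fields))
    return rows


# Build the whole layout in a temporary directory standing in for basedir.
basedir = Path(tempfile.mkdtemp())
for group in ("train", "dev", "eval"):
    write_list(basedir / group / "for_real.lst", REAL_ROWS)
    write_list(basedir / group / "for_attack.lst", ATTACK_ROWS)

train_real = read_list(basedir / "train" / "for_real.lst", 2)
train_attack = read_list(basedir / "train" / "for_attack.lst", 3)
```

Because every column is kept as a str, the client ID 001 read back from the file is not equal to 1, which matches the handling described for client_id.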
      

Note

If the database does not provide an evaluation set, the eval files can be omitted.
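
Before instantiating the database, it can help to confirm that the expected list files are in place. The following is a rough sketch under the assumption that the default file and sub-directory names are used; the function missing_lists and its require_eval flag are hypothetical and do not exist in bob.pad.db.

```python
import tempfile
from pathlib import Path

LIST_NAMES = ("for_real.lst", "for_attack.lst")


def missing_lists(basedir, require_eval=False):
    """Return the expected list files that are absent under basedir."""
    groups = ["train", "dev"] + (["eval"] if require_eval else [])
    return [f"{group}/{name}" for group in groups for name in LIST_NAMES
            if not (Path(basedir) / group / name).is_file()]


# Demo: a layout with train and dev lists only, built in a temp directory.
basedir = Path(tempfile.mkdtemp())
for group in ("train", "dev"):
    (basedir / group).mkdir()
    for name in LIST_NAMES:
        (basedir / group / name).touch()

without_eval = missing_lists(basedir)                  # eval treated as optional
with_eval = missing_lists(basedir, require_eval=True)  # eval lists are reported
```

Here the eval lists are only reported as missing when they are explicitly required, mirroring the note that they can be omitted for databases without an evaluation set.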

Protocols and File Lists

When you instantiate a database, you have to specify the base directory that contains the file lists. If you have only a single protocol, you could specify the full path to the file lists described above as follows:

>>> import bob.pad.db
>>> db = bob.pad.db.PadFileListDatabase('basedir/protocol')

Next, query the data WITHOUT specifying any protocol:

>>> db.objects()

Alternatively, if you have several protocols, you could do the following:

>>> db = bob.pad.db.PadFileListDatabase('basedir')
>>> db.objects(protocol='protocol')

When a protocol is specified, it is appended to the base directory that contains the file lists. This allows you to use several protocols that are stored in the same base directory, without the need to instantiate a new database. For instance, given two protocols ‘P1’ and ‘P2’ (with file lists contained in ‘basedir/P1’ and ‘basedir/P2’, respectively), the following would work:

>>> db = bob.pad.db.PadFileListDatabase('basedir')
>>> db.objects(protocol='P1') # Get the objects for the protocol P1
>>> db.objects(protocol='P2') # Get the objects for the protocol P2
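
The effect of appending the protocol to the base directory can be illustrated with plain path handling. This is only a sketch of the rule described above, not the library's own code: the helper list_file_path and its group and kind parameters are hypothetical, and the default file names are assumed.

```python
import os


def list_file_path(basedir, group, kind, protocol=None):
    """Build the path of a list file.

    Hypothetical helper: mirrors the rule that a protocol name, when given,
    is appended to the base directory. It is not the library's own code.
    """
    if group not in ("train", "dev", "eval"):
        raise ValueError(f"unknown group: {group!r}")
    if kind not in ("real", "attack"):
        raise ValueError(f"unknown kind: {kind!r}")
    root = os.path.join(basedir, protocol) if protocol else basedir
    return os.path.join(root, group, f"for_{kind}.lst")


# Single-protocol style: the protocol is already part of the base directory.
single = list_file_path(os.path.join("basedir", "P1"), "dev", "real")
# Multi-protocol style: the protocol is supplied per query.
multi = list_file_path("basedir", "dev", "real", protocol="P1")
other = list_file_path("basedir", "dev", "real", protocol="P2")
```

Under these assumptions both styles resolve to the same file for protocol P1, while P2 resolves to a sibling directory under the same base directory.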