Connecting legacy bob.bio.base to Vanilla Biometrics

The transition to the pipeline concept changed the way data goes from the raw sample to the extracted features, and how the biometric algorithm is applied. However, a set of tools was implemented to support the older bob implementations (designated as legacy) of database, preprocessor, extractor, and algorithms.

This adaptation consists of wrapper classes that take a legacy bob class as input and construct a Transformer or BiometricAlgorithm out of it.

Warning

A temporary folder is created in case a legacy bob package needs to write to disk during its operation. This folder persists between experiments, so you should remove its contents before running another experiment.
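
The cleanup mentioned in this warning could be scripted along the following lines. This is only a sketch: the text above does not say where the temporary folder lives, so the helper name clear_directory and the throwaway directory used in the demonstration are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def clear_directory(folder):
    """Delete everything inside `folder` but keep the folder itself."""
    folder = Path(folder)
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-directory and its content
        else:
            entry.unlink()  # remove a regular file or a symlink


# Demonstration on a throwaway directory that stands in for the legacy
# temporary folder (its real location is not given in this document).
with tempfile.TemporaryDirectory() as demo_dir:
    demo = Path(demo_dir)
    (demo / "projector.hdf5").write_text("stale")
    (demo / "models").mkdir()
    (demo / "models" / "001.hdf5").write_text("stale")

    clear_directory(demo)
    remaining = sorted(p.name for p in demo.iterdir())
```

Keeping the folder itself and only emptying it avoids breaking any code that expects the directory to exist.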

Legacy FileList Database interface

This database interface is similar to the CSV file interface, but instead of CSV files it reads a series of headerless files with two to four columns, and it returns a legacy database (use a Database Connector to create a database interface from it).

The files are separated into three sets to be used by the biometric verification algorithm: 'world' (training; optional), 'dev' (development; required) and 'eval' (evaluation; optional). The complete structure of the list base directory containing all the files (here denoted as filelists_directory) should look like this:

filelists_directory
|
+-- norm
|   |
|   +-- train_world.lst
|   +-- train_optional_world_1.lst
|   +-- train_optional_world_2.lst
|
+-- dev
|   |
|   +-- for_models.lst
|   +-- for_probes.lst
|   +-- for_scores.lst
|   +-- for_tnorm.lst
|   +-- for_znorm.lst
|
+-- eval
    |
    +-- for_models.lst
    +-- for_probes.lst
    +-- for_scores.lst
    +-- for_tnorm.lst
    +-- for_znorm.lst

The file lists contain several pieces of information that the biometric recognition experiment needs in order to run properly. The complete list of possible fields is:

  • filename: The name of the data file, relative to the common root of all data files, and without file name extension.

  • client_id: The name or ID of the subject whose biometric traces are contained in the data file. These names are handled as str objects, so 001 is different from 1.

  • model_id:

    • used for model enrollment: The name or ID of the client model that should be enrolled. In most cases, the model_id is identical to the client_id.

    • used for scoring: The name or ID of the client model that the probe file should be compared with.

  • claimed_client_id:

    • used for scoring: The client_id of the client model that the probe file should be compared with.

The following list files need to be created:

  • For training (optional):

    • world file, with default name train_world.lst, in the default sub-directory norm. It is a 2-column file with format:

      filename client_id
      
    • two world files, with default names train_optional_world_1.lst and train_optional_world_2.lst, in default sub-directory norm. The format is the same as for the world file. These files are not needed for most biometric recognition algorithms, hence, they need to be specified only if the algorithm uses them.

  • For enrollment:

    • one or two model files for the development (and evaluation) set, with default name for_models.lst in the default sub-directories dev (and eval). They are 3-column files with format:

      filename model_id client_id
      
  • For scoring:

    There exist two different ways to implement file lists used for scoring.

    • The first (and simpler) variant is to define a file list of probe files, where all probe files will be tested against all models. Hence, you need to specify one (or two) probe files for the development (and evaluation) set, with the default name for_probes.lst in the default sub-directory dev (and eval). They are 2-column files with format:

      filename client_id
      
    • The other option is to specify a detailed list of which probe files should be compared with which client model, i.e., one (or two) score files for the development (and evaluation) set, with the default name for_scores.lst in the sub-directory dev (and eval). These files need to be provided only if the scoring is to be done selectively, meaning by creating a sparse probe/model scoring matrix. They are 4-column files with format:

      filename model_id claimed_client_id client_id
      

    Note

    The verification queries use either only the probe files or only the score files, so only one of the two is mandatory. If only one of them is available, the scoring technique is determined automatically. If both probe and score files are provided, set the parameter use_dense_probe_file_list, which specifies the files to consider, when creating the object of the Database class.

  • For ZT score normalization (optional):

    Optionally, file lists for ZT score normalization can be added. These are:

    • one or two files for t-score normalization for the development (and evaluation) set, with default name for_tnorm.lst in the sub-directories dev (and eval). They are 3-column files with format:

      filename model_id client_id
      
    • one or two files for z-score normalization for the development (and evaluation) set, with default name for_znorm.lst in the sub-directories dev (and eval). They are 2-column files with format:

      filename client_id
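
The choice between probe files and score files described in the scoring note above can be sketched as follows. This is not the legacy implementation: the function scoring_mode and the labels "dense" and "sparse" are hypothetical, and the sketch assumes that use_dense_probe_file_list=True selects for_probes.lst.

```python
import tempfile
from pathlib import Path


def scoring_mode(basedir, group="dev", use_dense_probe_file_list=None):
    """Return "dense" (for_probes.lst) or "sparse" (for_scores.lst)."""
    group_dir = Path(basedir) / group
    has_probes = (group_dir / "for_probes.lst").is_file()
    has_scores = (group_dir / "for_scores.lst").is_file()

    if has_probes and has_scores:
        # Both lists exist, so the caller has to decide explicitly.
        if use_dense_probe_file_list is None:
            raise ValueError("both lists exist; set use_dense_probe_file_list")
        # Assumption: True means "use the probe list".
        return "dense" if use_dense_probe_file_list else "sparse"
    if has_probes:
        return "dense"
    if has_scores:
        return "sparse"
    raise FileNotFoundError(f"no probe or score list found in {group_dir}")


# Demonstration in a throwaway list base directory.
with tempfile.TemporaryDirectory() as basedir:
    dev = Path(basedir) / "dev"
    dev.mkdir()
    (dev / "for_scores.lst").write_text("f1 m1 c1 c1\n")
    only_scores = scoring_mode(basedir)

    (dev / "for_probes.lst").write_text("f1 c1\n")
    both_dense = scoring_mode(basedir, use_dense_probe_file_list=True)
```

Raising an error when both lists are present and no flag is given makes the ambiguity visible instead of silently picking one of the two files.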
      

Note

In all files, lines that start with # (optionally preceded by any amount of white space) are ignored.
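
To make the file format concrete, here is a minimal sketch of how such list lines could be read. It is not the parser used by the legacy interface: the function parse_list_lines is hypothetical, and the sketch assumes that columns are separated by white space and that blank lines are skipped as well.

```python
def parse_list_lines(lines, columns):
    """Split list-file lines into dicts keyed by the given column names."""
    rows = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip comment lines (leading white space allowed) and, as an
        # assumption of this sketch, blank lines too.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) != len(columns):
            raise ValueError(
                f"line {number}: expected {len(columns)} columns, "
                f"got {len(fields)}"
            )
        # Every field stays a str, so "001" and "1" remain distinct IDs.
        rows.append(dict(zip(columns, fields)))
    return rows


# Content of a made-up for_models.lst (filename model_id client_id).
example = [
    "# filename model_id client_id",
    "   # indented comment",
    "data/s1/img_01 001 001",
    "",
    "data/s2/img_01 1 1",
]
models = parse_list_lines(example, ("filename", "model_id", "client_id"))
```

The same function handles the 2-column and 4-column lists when the matching tuple of column names is passed.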

Legacy Database Connector

This legacy database wrapper translates the functions of an old bob.db package into a bob.pipelines database interface.

It uses objects() to retrieve a list of files for each role (world, references, and probes) and specified group (dev and eval) and creates the matching Sample and SampleSet lists.

This example shows the creation of the Mobio database interface in the bob.pipelines format from the legacy bob.db:

from bob.bio.face.database import MobioBioDatabase
from bob.bio.base.pipelines.vanilla_biometrics import DatabaseConnector
from bob.extension import rc

legacy_database = MobioBioDatabase(
    original_directory=rc["bob.db.mobio.directory"],
    annotation_directory=rc["bob.db.mobio.annotation_directory"],
    original_extension=".png",
    protocol="mobile0-male",
)

# Converts to a Database interface for bob.pipelines
database = DatabaseConnector(legacy_database)

# Sets the optimization flag
database.allow_scoring_with_all_biometric_references = True
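
Conceptually, the connector queries the legacy database for files and regroups them into sample sets. The sketch below illustrates that idea with plain Python stand-ins only: LegacyFile, Sample, SampleSet, ToyLegacyDatabase and references are all hypothetical and are not the bob.pipelines or DatabaseConnector classes, whose actual signatures and grouping rules may differ.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LegacyFile:  # hypothetical stand-in for a legacy file object
    path: str
    client_id: str


@dataclass
class Sample:  # hypothetical stand-in, not the bob.pipelines class
    key: str
    subject: str


@dataclass
class SampleSet:  # hypothetical stand-in, not the bob.pipelines class
    subject: str
    samples: list = field(default_factory=list)


class ToyLegacyDatabase:
    """Hypothetical legacy database exposing an objects() query."""

    def __init__(self, files):
        self._files = files  # {(group, role): [LegacyFile, ...]}

    def objects(self, group, role):
        return list(self._files.get((group, role), []))


def references(legacy_db, group="dev"):
    """Group enrollment files by subject, one SampleSet per subject."""
    grouped = defaultdict(list)
    for f in legacy_db.objects(group, "references"):
        grouped[f.client_id].append(Sample(key=f.path, subject=f.client_id))
    return [SampleSet(subject=s, samples=v) for s, v in sorted(grouped.items())]


toy_db = ToyLegacyDatabase(
    {
        ("dev", "references"): [
            LegacyFile("s1/img_01", "001"),
            LegacyFile("s1/img_02", "001"),
            LegacyFile("s2/img_01", "002"),
        ]
    }
)
refs = references(toy_db)
```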

Legacy Preprocessor wrapper

The PreprocessorTransformer wrapper takes a bob.bio.base.preprocessor object from the old bob.bio.base as input and creates a Transformer out of it. When the Transformer.transform() method is called, it calls the __call__() method of the bob.bio.base.preprocessor class.

This example shows how to create a Transformer out of a legacy preprocessor (FaceCrop, from bob.bio.face):

from bob.bio.face.preprocessor import FaceCrop
from bob.bio.base.transformers import PreprocessorTransformer

# Initialize the legacy Preprocessor
legacy_preprocessor = FaceCrop(
    cropped_size=(80, 64),
    # Eye positions are given as (y, x) tuples
    cropped_positions={'leye': (16, 15), 'reye': (16, 48)},
    fixed_positions={'leye': (50, 24), 'reye': (50, 64)},
)

# Create the Transformer
preprocessor_transformer = PreprocessorTransformer( legacy_preprocessor )
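
The delegation described above (transform() forwarding to the legacy __call__()) can be illustrated without any bob dependency. The sketch below is not the PreprocessorTransformer implementation: CallableTransformer and ToyLegacyPreprocessor are hypothetical, and the toy preprocessor takes only the data, whereas a real legacy preprocessor may also receive annotations.

```python
class CallableTransformer:
    """Hypothetical adapter: forwards transform() to the wrapped callable."""

    def __init__(self, legacy_callable):
        self.legacy_callable = legacy_callable

    def fit(self, X, y=None):
        # Nothing to learn here; the wrapped object is used as-is.
        return self

    def transform(self, X):
        # Apply the legacy __call__() to every item of the batch.
        return [self.legacy_callable(x) for x in X]


class ToyLegacyPreprocessor:
    """Hypothetical legacy-style preprocessor: crops a 2D list of pixels."""

    def __init__(self, height, width):
        self.height, self.width = height, width

    def __call__(self, image):
        # Keep the top-left height x width corner of the image.
        return [row[: self.width] for row in image[: self.height]]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transformer = CallableTransformer(ToyLegacyPreprocessor(height=2, width=2))
cropped = transformer.fit([image]).transform([image])
```

Because the adapter only relies on the wrapped object being callable, the same pattern covers any legacy class that does its work in __call__().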

Legacy Extractor wrapper

A similar wrapper, the ExtractorTransformer, is available for the legacy bob.bio.base Extractor. It maps the Transformer.transform() method to the __call__() of the legacy Extractor.

Here is an example showing how to create a Transformer from a legacy Extractor (Linearize, from bob.bio.base):

from bob.bio.base.extractor import Linearize
from bob.bio.base.transformers import ExtractorTransformer

# Create the Transformer from the legacy Extractor
extractor_transformer = ExtractorTransformer( Linearize() )

Legacy Algorithm wrappers

Lastly, AlgorithmTransformer and BioAlgorithmLegacy are available to map a legacy Algorithm to a Transformer and a BioAlgorithm, respectively.

Two adaptors are needed because a legacy Algorithm can include a projector, possibly trainable (with the methods project() and train_projector()), which corresponds to a Transformer in the new API. Enrollment and scoring in the legacy algorithm are done with the enroll() and score() methods, which map to the methods of the same name in a BioAlgorithm.

Here is an example showing how to create the Transformer out of a bob.bio.base Algorithm (Distance):

from bob.bio.base.algorithm import Distance
from bob.bio.base.transformers import AlgorithmTransformer
import scipy.spatial

legacy_algorithm = Distance(
    distance_function=scipy.spatial.distance.cosine,
    is_distance_function=True,
)

# Create the Transformer from the legacy Algorithm
algorithm_transformer = AlgorithmTransformer(legacy_algorithm)

And here is an example of creating the BioAlgorithm from the bob.bio.base Algorithm (Distance) with BioAlgorithmLegacy, which maps the enroll() and score() methods:

from bob.bio.base.algorithm import Distance
from bob.bio.base.pipelines.vanilla_biometrics.legacy import BioAlgorithmLegacy
import scipy.spatial

legacy_algorithm = Distance(
    distance_function=scipy.spatial.distance.cosine,
    is_distance_function=True,
)

bio_algorithm = BioAlgorithmLegacy(legacy_algorithm)
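
The split of one legacy Algorithm into a Transformer part and a BioAlgorithm part can be illustrated with a self-contained toy. None of the classes below come from bob: ToyLegacyAlgorithm, ProjectorAsTransformer and EnrollScoreAdapter are hypothetical, and the only thing taken from the text above is the set of method names (train_projector(), project(), enroll(), score()).

```python
class ToyLegacyAlgorithm:
    """Hypothetical legacy-style algorithm with the four legacy methods."""

    def __init__(self):
        self.mean = None

    def train_projector(self, features):
        # Learn the per-dimension mean of the training features.
        n = len(features)
        self.mean = [sum(col) / n for col in zip(*features)]

    def project(self, feature):
        # Center a feature vector with the learned mean.
        return [v - m for v, m in zip(feature, self.mean)]

    def enroll(self, features):
        # The model is the average of the enrollment features.
        n = len(features)
        return [sum(col) / n for col in zip(*features)]

    def score(self, model, probe):
        # Negative squared Euclidean distance: higher means more similar.
        return -sum((a - b) ** 2 for a, b in zip(model, probe))


class ProjectorAsTransformer:
    """Exposes train_projector()/project() as fit()/transform()."""

    def __init__(self, legacy):
        self.legacy = legacy

    def fit(self, X, y=None):
        self.legacy.train_projector(X)
        return self

    def transform(self, X):
        return [self.legacy.project(x) for x in X]


class EnrollScoreAdapter:
    """Exposes the legacy enroll()/score() under the same names."""

    def __init__(self, legacy):
        self.legacy = legacy

    def enroll(self, features):
        return self.legacy.enroll(features)

    def score(self, model, probe):
        return self.legacy.score(model, probe)


legacy = ToyLegacyAlgorithm()
training = [[0.0, 0.0], [2.0, 2.0]]
transformer = ProjectorAsTransformer(legacy).fit(training)
projected = transformer.transform([[1.0, 1.0], [3.0, 1.0]])

bio = EnrollScoreAdapter(legacy)
model = bio.enroll([projected[0]])
genuine = bio.score(model, projected[0])
impostor = bio.score(model, projected[1])
```

Both adapters share the same legacy instance, so whatever the projector learns during fit() is available when features are later enrolled and scored.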