Biometric Scores 2014 (BIOSCOTE 2014)

This dataset contains the raw scores, in plain text format, of several biometric (face and speaker) recognition systems applied to several databases.

This dataset contains the raw biometric (face and speaker) scores that were used to generate the plots reported in the following Ph.D. thesis:

@phdthesis{ElShafey_EPFL2014,
    title = {Scalable Probabilistic Models for Face and Speaker Recognition},
    author = {Laurent El Shafey},
    month = {April},
    year = {2014},
    school = {Ecole Polytechnique F{\'e}d{\'e}rale de Lausanne (EPFL)},
    url = {http://publications.idiap.ch/index.php/publications/show/2830},
}

Full description

The biometric recognition systems are described in the aforementioned manuscript and encompass Gaussian mixture models, inter-session variability modeling, joint factor analysis, and probabilistic linear discriminant analysis.

The following databases are considered:

BANCA

AR face database

Face Recognition Grand Challenge version 2

The Good, The Bad and the Ugly

Labeled Faces in the Wild

Multi-PIE

MOBIO

CAS-PEAL

NIST Speaker Recognition Evaluation 2012

The evaluation protocols considered on these databases (including both verification and identification protocols) are described in the aforementioned manuscript and are available through PyPI:

BANCA

AR face database

Face Recognition Grand Challenge version 2

The Good, The Bad and the Ugly

Labeled Faces in the Wild (see also https://pypi.python.org/pypi/xbob.db.lfwidentification)

Multi-PIE

MOBIO

CAS-PEAL

NIST Speaker Recognition Evaluation 2012

These scores make it quick and easy to reproduce the plots of the manuscript using the following package:
https://pypi.python.org/pypi/xbob.thesis.elshafey2014

Data organization and format

The score files are organized according to the following directory structure:
$DATABASE_NAME/$SYSTEM_NAME/scores/$PROTOCOL_NAME/$NORM/scores-$SET

where:

$NORM is either 'nonorm' (no score normalization) or 'ztnorm' (ZT-norm score normalization)

$SET is either 'dev' (development set) or 'eval' (evaluation set)
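
As an illustration of the layout above, the following minimal Python sketch assembles the path of a score file from its components. The function name score_file_path and the example values ("mobio", "gmm", "male") are hypothetical placeholders chosen for illustration; the actual database, system and protocol directory names must be taken from the downloaded data.

```python
from pathlib import Path

# Values documented for $NORM and $SET in the directory structure.
NORMS = ("nonorm", "ztnorm")
SETS = ("dev", "eval")


def score_file_path(root, database, system, protocol, norm, subset):
    """Build $DATABASE_NAME/$SYSTEM_NAME/scores/$PROTOCOL_NAME/$NORM/scores-$SET.

    `root` is the directory where the dataset was unpacked (an assumption
    of this sketch, not something fixed by the dataset).
    """
    if norm not in NORMS:
        raise ValueError(f"norm must be one of {NORMS}, got {norm!r}")
    if subset not in SETS:
        raise ValueError(f"subset must be one of {SETS}, got {subset!r}")
    return Path(root) / database / system / "scores" / protocol / norm / f"scores-{subset}"


# Hypothetical example values; replace them with real directory names.
path = score_file_path("data", "mobio", "gmm", "male", "ztnorm", "eval")
print(path.as_posix())  # data/mobio/gmm/scores/male/ztnorm/scores-eval
```

Validating $NORM and $SET up front gives a clear error message instead of a missing-file error later on.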

Each score file is a plain text file with one trial score per line, in the following four-column format:
model_client_identity probe_client_identity probe_filename score_value

When the first two columns (model_client_identity and probe_client_identity) match, the trial is a true claimant access; otherwise, it is an impostor access.
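
The four-column format and the claimant/impostor rule can be sketched in Python as follows. This is a minimal sketch, assuming that columns are separated by whitespace and that probe filenames contain no spaces. The function names split_scores and load_scores are hypothetical, and the sample lines are made-up values, not data from the dataset.

```python
from typing import Iterable, List, Tuple


def split_scores(lines: Iterable[str]) -> Tuple[List[float], List[float]]:
    """Split trial lines into (true claimant scores, impostor scores).

    Each non-empty line is expected to hold four whitespace-separated fields:
    model_client_identity probe_client_identity probe_filename score_value
    """
    genuine: List[float] = []
    impostor: List[float] = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        model_id, probe_id, _probe_file, score = fields
        # Matching identities mean a true claimant access, otherwise an impostor access.
        if model_id == probe_id:
            genuine.append(float(score))
        else:
            impostor.append(float(score))
    return genuine, impostor


def load_scores(path) -> Tuple[List[float], List[float]]:
    """Read one score file and split its trials."""
    with open(path, encoding="utf-8") as handle:
        return split_scores(handle)


# Made-up sample lines, for illustration only.
sample = ["001 001 probe_a 1.5", "001 002 probe_b -0.3", "002 002 probe_c 2.0"]
genuine, impostor = split_scores(sample)
print(genuine, impostor)  # [1.5, 2.0] [-0.3]
```

The two resulting lists are the usual starting point for computing error rates at a given threshold.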

Reference

If you use this dataset in a publication, please cite the following thesis:

Laurent El Shafey, “Scalable Probabilistic Models for Face and Speaker Recognition”, PhD thesis, Ecole Polytechnique Fédérale de Lausanne (EPFL), April 2014.
http://publications.idiap.ch/index.php/publications/show/2830