
bioscote: BIOmetric SCOres Thesis Elshafey 2014

This dataset contains the raw biometric (face and speaker) scores that were used to generate the plots reported in the following Ph.D. thesis:

    title = {Scalable Probabilistic Models for Face and Speaker Recognition},
    author = {Laurent El Shafey},
    month = {April},
    year = {2014},
    school = {Ecole Polytechnique F{\'e}d{\'e}rale de Lausanne (EPFL)},
    url = {},

Full description:

This dataset contains raw scores, in plain text format, of several biometric (face and speaker) recognition systems applied to several databases.

The biometric recognition systems are described in the aforementioned manuscript and encompass Gaussian mixture models, inter-session variability modeling, joint factor analysis and probabilistic linear discriminant analysis.

The databases considered are the following ones:

The evaluation protocols used with these databases (both verification and identification) are described in the aforementioned manuscript and are available through PyPI:

These scores make it possible to replicate the plots of the manuscript quickly and easily using the following package:

Data organization and format:

The score files are organized according to the following directory structure:


  • $NORM is either 'nonorm' (no score normalization) or 'ztnorm' (ZT-norm score normalization)
  • $SET is either 'dev' (development set) or 'eval' (evaluation set)

Each score file is plain text, with a single trial score per line in the following four-column format:
model_client_identity probe_client_identity probe_filename score_value

When the first two columns (model_client_identity and probe_client_identity) match, the trial is a true claimant access; otherwise, it is an impostor access.
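As an illustration of the format described above, the following minimal Python sketch splits the trials of a score file into true claimant and impostor scores. It assumes that the four columns are separated by whitespace and that probe_filename itself contains no whitespace; the function name split_scores and the sample lines are hypothetical and are not taken from the dataset or from the thesis software.

```python
from typing import Iterable, List, Tuple


def split_scores(lines: Iterable[str]) -> Tuple[List[float], List[float]]:
    """Split four-column score lines into (true claimant, impostor) scores."""
    genuine: List[float] = []
    impostor: List[float] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        model_id, probe_id, _probe_file, score = fields
        # Matching identities mean a true claimant access; otherwise an impostor.
        if model_id == probe_id:
            genuine.append(float(score))
        else:
            impostor.append(float(score))
    return genuine, impostor


# Made-up sample lines, for illustration only (not real dataset content).
sample = [
    "client_01 client_01 probe_a 2.5",
    "client_01 client_02 probe_b -1.25",
    "client_02 client_02 probe_c 0.75",
]
genuine, impostor = split_scores(sample)
print(len(genuine), "true claimant trials,", len(impostor), "impostor trials")
```

To process a real score file, an open file object can be passed directly, since iterating over it yields one line per trial.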


If you use this dataset in a publication, we would appreciate it if you cited the following thesis:
Laurent El Shafey, "Scalable Probabilistic Models for Face and Speaker Recognition", Ecole Polytechnique Fédérale de Lausanne (EPFL), 2014.

Size of the dataset:

total 80G

56M     arface.tar.gz         320 files
3.5M    banca.tar.gz          112 files
3.1G    caspeal.tar.gz        198 files
11G     frgc.tar.gz            27 files
1.5G    gbu.tar.gz            102 files
1.5G    lfw.tar.gz            422 files
995M    mobio.tar.gz         1296 files
1.4G    multipie.tar.gz       920 files
61G     nist_sre12.tar.gz     168 files