Voxforge Database Verification Protocols

Speaker recognition protocol on the Voxforge Database

Voxforge offers a collection of transcribed speech for use with Free and Open Source Speech Recognition Engines. In this package, we design a speaker recognition protocol that uses a small subset of the English audio files (only 6561 files) belonging to 30 randomly selected speakers. This subset is split into three equally sized sets: Training (10 speakers), Development (10 speakers) and Test (10 speakers).
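The three-way split described above can be illustrated with a short sketch. It is not the procedure used to build the protocol shipped with this package: the function name `split_speakers`, the fixed seed and the speaker IDs are all hypothetical and only show how 30 speakers can be divided into three disjoint sets of 10.

```python
import random


def split_speakers(speaker_ids, seed=0):
    """Shuffle speaker IDs and split them into three equal, disjoint sets.

    Illustrative only: this is an assumed procedure, not the one used
    to generate the protocol files of this package.
    """
    if len(speaker_ids) % 3 != 0:
        raise ValueError("number of speakers must be divisible by 3")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = sorted(speaker_ids)  # sort first so input order does not matter
    rng.shuffle(shuffled)
    n = len(shuffled) // 3
    return {
        "train": shuffled[:n],
        "dev": shuffled[n:2 * n],
        "test": shuffled[2 * n:],
    }


# Hypothetical speaker IDs; real Voxforge speaker names differ.
speakers = [f"speaker_{i:02d}" for i in range(30)]
split = split_speakers(speakers)
for name, ids in split.items():
    print(name, len(ids))
```

Each speaker appears in exactly one set, so no speaker seen during training is reused in the Development or Test sets.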

This package serves as a toy example of a speaker recognition database for testing bob.bio.spear.

bob.bio.spear was developed at Idiap during its participation in the NIST SRE 2012 evaluation. If you use this package and/or its results, please cite the following publications:

  1. The original paper presented at the NIST SRE 2012 workshop:

    @inproceedings{Khoury_NISTSRE_2012,
      author = {Khoury, Elie and El Shafey, Laurent and Marcel, S\'ebastien},
      title = {The Idiap Speaker Recognition Evaluation System at NIST SRE 2012},
      booktitle = {NIST Speaker Recognition Conference},
      year = {2012},
      month = dec,
      location = {Orlando, USA},
      organization = {NIST},
      pdf = {http://publications.idiap.ch/downloads/papers/2012/Khoury_NISTSRE_2012.pdf}
    }
    
  2. Bob as the core framework used to run the experiments:

    @inproceedings{Anjos_ACMMM_2012,
      author = {Anjos, Andr\'e and El Shafey, Laurent and Wallace, Roy and G\"unther, Manuel and McCool, Christopher and Marcel, S\'ebastien},
      title = {Bob: a free signal processing and machine learning toolbox for researchers},
      year = {2012},
      month = oct,
      booktitle = {20th ACM Conference on Multimedia Systems (ACMMM), Nara, Japan},
      publisher = {ACM Press},
      pdf = {http://publications.idiap.ch/downloads/papers/2012/Anjos_Bob_ACMMM12.pdf}
    }
    

Getting the data

The original data can be downloaded directly from Voxforge, or by running download_and_untar_voxforge.py, which takes as input the path where the data will be stored (VOXFORGE_DATABSE by default):

$ download_and_untar_voxforge.py --address PATH/TO/WAV/DIRECTORY

Note

Running this script requires this package to be installed. Depending on the installation strategy you use (such as pip), the directory where the script is placed might differ.
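Once the download has finished, a quick sanity check is to count the audio files under the target directory. The sketch below assumes the extracted audio is stored as .wav files somewhere below that directory; `count_wav_files` is a hypothetical helper and not part of this package.

```python
from pathlib import Path


def count_wav_files(root):
    """Count .wav files (case-insensitive extension) anywhere under root."""
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    return sum(
        1
        for path in root.rglob("*")  # walk the directory tree recursively
        if path.is_file() and path.suffix.lower() == ".wav"
    )


# Example call (replace the path with the directory passed to the script):
# print(count_wav_files("PATH/TO/WAV/DIRECTORY"))
```

A count of zero usually means the path is wrong or the archives were not extracted.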
