# Voxforge Database Verification Protocols

## Speaker recognition protocol on the Voxforge Database

Voxforge offers a collection of transcribed speech for use with Free and Open Source Speech Recognition Engines. In this package, we design a speaker recognition protocol that uses a small subset of the English audio files (only 6561 files) belonging to 30 randomly selected speakers. This subset is split into three equally sized sets: Training (10 speakers), Development (10 speakers) and Test (10 speakers).

This package serves as a toy example of a speaker recognition database for testing bob.bio.spear.
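The three-way split described above can be illustrated with a minimal sketch. This is not the code that generated the protocol shipped with the package: the function name `split_speakers`, the fixed seed, and the placeholder speaker IDs are all hypothetical and only show how 30 speakers can be partitioned at random into three disjoint sets of 10.

```python
import random


def split_speakers(speaker_ids, seed=0):
    """Shuffle speaker IDs and split them into three equal, disjoint sets.

    Hypothetical helper for illustration only; the actual speaker
    assignment of the Voxforge protocol is fixed by the package.
    """
    ids = sorted(speaker_ids)  # sort first so the result depends only on the seed
    if len(ids) % 3 != 0:
        raise ValueError("the number of speakers must be divisible by 3")
    rng = random.Random(seed)  # local generator, leaves global random state untouched
    rng.shuffle(ids)
    n = len(ids) // 3
    return {
        "train": ids[:n],
        "dev": ids[n:2 * n],
        "test": ids[2 * n:],
    }


# Placeholder IDs; real Voxforge speaker names are not assumed here.
speakers = [f"speaker_{i:02d}" for i in range(30)]
protocol = split_speakers(speakers)
```

Splitting by speaker rather than by file keeps every speaker in exactly one set, so no voice heard during training reappears in development or test.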

bob.bio.spear was developed at Idiap during its participation in the NIST SRE 2012 evaluation. If you use this package and/or its results, please cite the following publications:

1. The original paper presented at the NIST SRE 2012 workshop:

@inproceedings{Khoury_NISTSRE_2012,
author = {Khoury, Elie and El Shafey, Laurent and Marcel, S\'ebastien},
title = {The Idiap Speaker Recognition Evaluation System at NIST SRE 2012},
booktitle = {NIST Speaker Recognition Conference},
year = {2012},
month = dec,
location = {Orlando, USA},
organization = {NIST},
pdf = {http://publications.idiap.ch/downloads/papers/2012/Khoury_NISTSRE_2012.pdf}
}

2. Bob as the core framework used to run the experiments:

@inproceedings{Anjos_ACMMM_2012,
author = {Anjos, Andr\'e and El Shafey, Laurent and Wallace, Roy and G\"unther, Manuel and McCool, Christopher and Marcel, S\'ebastien},
title = {Bob: a free signal processing and machine learning toolbox for researchers},
year = {2012},
month = oct,
booktitle = {20th ACM Conference on Multimedia Systems (ACMMM), Nara, Japan},
publisher = {ACM Press},
pdf = {http://publications.idiap.ch/downloads/papers/2012/Anjos_Bob_ACMMM12.pdf}
}


### Getting the data

The original data can be downloaded directly from Voxforge or by running download_and_untar_voxforge.py, which takes as input the path where the data will be stored (VOXFORGE_DATABSE by default):

$ download_and_untar_voxforge.py --address PATH/TO/WAV/DIRECTORY


Note

Running this script requires this package to be installed. Depending on how you installed it (for example, with pip), the directory containing the script may differ.
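After the download finishes, a quick sanity check is to count the audio files that ended up under the target directory. The sketch below is an assumption-laden illustration: `count_wav_files` is a hypothetical helper, not part of bob.bio.spear, and it assumes only that the audio is stored as files with a `.wav` extension somewhere below the given path. The count need not match the 6561 files used by the protocol.

```python
from pathlib import Path


def count_wav_files(root):
    """Return the number of .wav files found recursively under `root`.

    Hypothetical helper; the directory layout below `root` is not assumed.
    """
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # Match the extension case-insensitively so .WAV files are also counted.
    return sum(
        1 for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".wav"
    )
```

For example, `count_wav_files("PATH/TO/WAV/DIRECTORY")` reports how many WAV files are available at the path passed to the download script.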