Tools implemented in bob.bio.spear

Summary

Databases

bob.bio.spear.database.SpearBioDatabase(name)

Database interface for the bob.bio.spear datasets for speaker recognition.

Speech Annotators (VAD)

bob.bio.spear.annotator.Energy_2Gauss([...])

Detects voice activity using the energy of the signal and a two-Gaussian GMM.

bob.bio.spear.annotator.Energy_Thr([...])

VAD based on an energy threshold.

bob.bio.spear.annotator.Mod_4Hz([...])

VAD based on the modulation of the energy around 4 Hz and on the energy itself.

Voice Feature Extractors

bob.bio.spear.extractor.Cepstral([...])

Extracts cepstral features from WAV audio data.

Databases

bob.bio.spear.database.SpearBioDatabase(name: str, protocol: Optional[str] = None, dataset_protocol_path: Optional[str] = None, data_path: Optional[str] = None, data_ext: str = '.wav', annotations_path: Optional[str] = None, annotations_ext: str = '.json', force_sample_rate: Optional[int] = None, force_channel: Optional[int] = None, **kwargs)

Database interface for the bob.bio.spear datasets for speaker recognition.

This database interface is meant to be used with bob.bio.base pipelines.

Given a series of CSV files (provided locally or downloaded from the bob data server), it creates the Sample objects for each role needed by the pipeline (enroll, probe) and for each group (train, dev, eval).

Each sample contains:

  • data: the wav audio data,

  • rate: the sample rate of data,

  • (optional) annotations: some annotations loaded from files if annotations_path is provided.

Protocol definition files (CSV files) are distinct from the data files (WAV files):

  • protocol definition files list the file paths and the corresponding reference names. They are available on the bob data server.

  • data files are the actual files of the dataset (pointed to by the definition files). They are not provided by bob.

You have to point the bob configuration to the root folder of the data files using the following command:

$ bob config set bob.db.<database_name>.directory <your_path_to_data>

The final data paths are built by combining the directory stored under the bob.db.<database_name>.directory key with the paths listed in the CSV protocol definition files.
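As a rough illustration of how a data root and the protocol entries combine into full paths, the sketch below reads a small CSV and joins each entry with a root folder. The column names (path, reference_id), the example CSV content, the /datasets/my_db root, and the helper resolve_data_paths are all hypothetical; the real protocol files and the library's internal logic may differ.

```python
import csv
import io
from pathlib import PurePosixPath

# Hypothetical protocol definition content (assumed column names).
# Real definition files come from the bob data server.
EXAMPLE_CSV = """path,reference_id
speaker_a/utt_001,speaker_a
speaker_b/utt_007,speaker_b
"""


def resolve_data_paths(csv_text, data_root, data_ext=".wav"):
    """Return (full_path, reference_id) pairs for each row of a protocol CSV.

    Assumes the CSV stores paths relative to the data root, without extension.
    """
    root = PurePosixPath(data_root)
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (str(root / (row["path"] + data_ext)), row["reference_id"])
        for row in rows
    ]


# "/datasets/my_db" stands in for the value of bob.db.<database_name>.directory.
resolved = resolve_data_paths(EXAMPLE_CSV, "/datasets/my_db")
for full_path, reference_id in resolved:
    print(reference_id, full_path)
```

Keeping the root outside the CSV files is what lets the same protocol definitions be shared between machines that store the dataset in different locations.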

Parameters
  • name – name of the database used for retrieving config keys and files.

  • protocol – protocol to use (sub-folder containing the protocol definition files).

  • dataset_protocol_path – Path to an existing protocol definition folder structure. If None: will download the definition files to a datasets folder in the path pointed by the bob_data_folder config (see bob.extension.download.get_file()).

  • data_path – Path to the data files of the database. If None: will use the path in the bob.db.<database_name>.directory config.

  • data_ext – File extension of the data files.

  • annotations_path – Path to the annotations files of the dataset, if available. If None: will not load any annotations (you could then annotate on the fly with a transformer).

  • annotations_ext – If annotations_path is provided, will load annotations using this extension.

  • force_sample_rate – If not None, will force the sample rate of the data to a specific value. Otherwise the sample rate will be specified by each loaded file.

  • force_channel – If not None, will force loading the nth channel of each file. If None and the samples have a channel attribute, that channel will be loaded. Otherwise, all channels will be loaded in a 2D array if multiple are present.
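
The precedence described for force_channel can be sketched in plain Python as follows. This is only an illustration of the documented rule, not the library's implementation: the helper select_channel, its sample_channel argument (standing in for a sample's channel attribute), and the list-of-channels data layout are assumptions. Returning the single channel directly for mono data is also an assumption, inferred from the 2D array being mentioned only for multi-channel files.

```python
def select_channel(channels, force_channel=None, sample_channel=None):
    """Pick audio channels following the documented precedence.

    `channels` is a list of per-channel sample lists (an assumed layout).
    """
    # 1. An explicit force_channel always wins.
    if force_channel is not None:
        return channels[force_channel]
    # 2. Otherwise use the channel attached to the sample, if any.
    if sample_channel is not None:
        return channels[sample_channel]
    # 3. Otherwise keep everything: mono stays 1D (assumption),
    #    multi-channel data is returned as a 2D structure.
    if len(channels) == 1:
        return channels[0]
    return channels


stereo = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
forced = select_channel(stereo, force_channel=1)
from_sample = select_channel(stereo, sample_channel=0)
everything = select_channel(stereo)
```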