Odessa

The Odessa dataset contains 42 short VoIP conversations, each between a pair of speakers drawn from a pool of 14 individuals.

This database was recorded during the ODESSA project.

The scenario was as follows:

The speakers were given pre-prepared scripts and read their assigned roles. Each speaker used a PC. The session animator recorded the conversation from a third PC while keeping himself muted.

This database is intended for testing diarization algorithms and is considered relatively clean. All audio files are in .wav format.
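
Since the recordings are .wav files, a quick sanity check before running a diarization system is to list each file with its channel count, sample rate, and duration. The sketch below is only an illustration using Python's standard `wave` module. The folder name `odessa` and the helper names `wav_info` and `summarize` are assumptions, not part of the dataset, and the code assumes the files are plain PCM WAV, which this description does not confirm.

```python
import wave
from pathlib import Path


def wav_info(path):
    """Return (channels, sample_rate_hz, duration_s) for a PCM .wav file."""
    with wave.open(str(path), "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        duration = wf.getnframes() / rate
    return channels, rate, duration


def summarize(root):
    """Map each .wav file name found under `root` to its wav_info tuple."""
    return {p.name: wav_info(p) for p in sorted(Path(root).rglob("*.wav"))}


if __name__ == "__main__":
    # "odessa" is a hypothetical local folder; point it at your copy of the data.
    for name, (ch, sr, dur) in summarize("odessa").items():
        print(f"{name}: {ch} ch, {sr} Hz, {dur:.1f} s")
```

If the layout matches the description above, the summary should list 42 conversation files, though the actual directory structure and file naming are not specified here.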