Speaker Diarization Toolkit
The toolkit is intended to facilitate research in multistream speaker diarization by providing a platform for experimenting with novel audio, video or location features. It is based on the Information Bottleneck (IB) principle and is explicitly designed to use several heterogeneous feature streams.
The scientific papers [1,2,3] report results obtained on meeting recordings. The original formulation, proposed in [1], used acoustic (MFCC) information only. The approach was later extended to combine MFCC and TDOA features [2], and in [3] the IB approach was applied to the combination of four different feature streams.
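As an illustration of what combining heterogeneous feature streams can look like, the sketch below averages per-stream posterior distributions with fixed weights before any clustering step. This is an assumption made for illustration only, not a description of the toolkit's actual code or interface: the function name combine_streams, the array layout (one row per speech segment, one column per mixture component) and the example weights are all hypothetical.

```python
import numpy as np


def combine_streams(posteriors, weights):
    """Weighted average of per-stream posteriors (hypothetical helper).

    posteriors: list of arrays, one per feature stream, each of shape
                (n_segments, n_components) with rows summing to 1.
    weights:    one non-negative weight per stream, summing to 1.
    """
    weights = np.asarray(weights, dtype=float)
    if len(posteriors) != len(weights):
        raise ValueError("need exactly one weight per stream")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")

    # Stack to shape (n_streams, n_segments, n_components).
    stacked = np.stack([np.asarray(p, dtype=float) for p in posteriors])

    # Contract the stream axis against the weights, leaving
    # an array of shape (n_segments, n_components).
    return np.tensordot(weights, stacked, axes=1)


if __name__ == "__main__":
    # Toy posteriors for two segments and two components (made-up numbers).
    mfcc = [[0.8, 0.2], [0.4, 0.6]]
    tdoa = [[0.6, 0.4], [0.2, 0.8]]
    print(combine_streams([mfcc, tdoa], weights=[0.7, 0.3]))
```

Because the weights sum to 1, each row of the combined array is again a valid probability distribution, so it can be passed to a clustering step in the same way as a single-stream posterior. How the weights are chosen is left open here.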
References:
[1] An Information Theoretic Approach to Speaker Diarization of Meeting Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio, Speech, and Language Processing, 17(7), 2009
[2] An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(2), 2011
[3] Multistream speaker diarization of meetings recordings beyond MFCC and TDOA features, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: Speech Communication, 54(1), 2012

