Flexible Linguistically-guided Objective Speech aSsessment

Speech assessment is crucial for the development of speech technologies such as speech transmission systems, speech synthesis systems, assistive technologies for impaired speech, and language learning, to name a prominent few. Traditionally, speech assessment is carried out using human subjects. This is expensive in terms of cost and time, and may not be reproducible. As a consequence, there is a thrust towards the development of objective speech assessment techniques. In the literature, objective speech assessment techniques have largely emerged through speech transmission research. These methods are typically spectral-based, incorporate knowledge of human hearing, and are not easily scalable to other speech assessment tasks such as synthetic speech assessment and pathological speech assessment. Building upon recent work at Idiap on objective speech intelligibility assessment and on the assessment of accentedness of non-native speech carried out through industrial research, FLOSS aims to develop a unified framework for objective speech intelligibility and quality assessment through the integration of linguistic knowledge and novel machine learning approaches. Specifically, FLOSS will (1) first develop a framework for objective intelligibility assessment that scales across multiple languages and speech types (e.g., noisy, pathological, whispered); and (2) then build on it to develop a framework in which both intelligibility aspects and quality aspects such as naturalness, emotion, and non-native accent can be jointly modeled and assessed. The research will focus on the assessment of telephone speech, synthetic speech, non-native speech, emotional speech, and pathological speech.

Hasler Stiftung (Hasler Foundation)
Mar 01, 2017
Feb 29, 2020