Towards Integrated processing of Physiological and Speech signals

Speech processing research has largely focused on modeling the activity of the voice source and the activities in the oral cavity. However, speech production is intrinsically related to other physiological activities, such as respiration and heart activity, which can change for a variety of reasons: mood or emotion, social or environmental setting (e.g., loud versus quiet environment), neurodegenerative diseases (e.g., Parkinson's disease), or stroke. These changes can in turn affect speech communication. In the speech community, very little research has been undertaken to understand the relationship between speech and physiological activities such as respiration and heart rate. Idiap and CSEM are currently developing a platform in which speech and physiological signals are collected synchronously through a wearable cooperative sensor and processed to develop novel speech- and physiology-based applications. Building on the outcomes of this collaboration, the proposed project TIPS aims to investigate the relationship between speech signals and physiological signals under different speaking conditions (read speech, spontaneous speech, group conversation, public speaking, and speech under cognitive stress), and to develop methods (1) to predict physiological parameters from the speech signal, (2) to improve mental stress detection by combining speech and physiological information, and (3) to robustly segment utterances into words and phrases by jointly modeling speech and physiological signals. The outcomes of the proposed project are of interest not only to the speech community but also to other fields, social computing and health care being two prominent examples. The proposed research will be carried out in collaboration with CSEM, which will contribute its expertise in sensor development and physiological signal acquisition.
We will also collaborate with an executive coach and expert on spoken word, who will help us develop methods to detect mental stress in public speaking. Finally, we will collaborate with the German Aerospace Center (DLR), Braunschweig, to acquire speech and physiological data under cognitive stress conditions, specifically data from air traffic controllers, and to develop novel speech- and physiology-based methods for measuring cognitive stress. TIPS will fund two young researchers, one PhD student and one postdoc, and will train them at the intersection of speech processing, physiological signal processing, sensor fusion, and machine learning.
Coaching & Moderation, Centre Suisse d'Electronique et de Microtechnique
Swiss National Science Foundation
Nov 01, 2019
Oct 31, 2023