Audio Inference
Vision
We work in the general area of audio processing, particularly where the audio signal contains speech. Our work centres on building models of production and perception that allow machines to mimic the capabilities of humans. These models connect the audio signal with higher-level semantics such as speech, language, affect and intent.
We embrace rigorous mathematical and Bayesian techniques to infer model parameters as well as the semantics required by the application. More speculatively, we also use such techniques to allow our models to make inferences about the human production, perception and cognitive systems on which they are based.
The group title abbreviates to AI, emphasising a deep connection to the wider Artificial Intelligence field of our host institute. Through our audio and speech focus we contribute to Idiap’s core Human-AI Teaming program; through the inference of biological function we also aim to contribute to the AI for Life program. In each case we benefit from colleagues with complementary skills, and hope to assist them in their own endeavours.
Current Group Members
GARNER, Philip
(Senior Research Scientist)
- website
AKSTINAITE, Vita
(Postdoctoral Researcher)
- website
CHEN, Haolin
(PhD Student / Research Assistant)
- website
HE, Mutian
(PhD Student / Research Assistant)
- website
COPPIETERS DE GIBSON, Louise
(PhD Student / Research Assistant)
- website
Alumni
Current Projects
Recent Projects
- ADEL - Automatic Detection of Leadership from Voice and Body
- DAHL - Domain Adaptation via Hierarchical Lexicons
- DEEPCHARISMA - Deep Learning Charisma
- EVOLANG - Evolving Language
- L-PASS - Linguistic-Paralinguistic Speech Synthesis
- MASS - Multilingual Affective Speech Synthesis
- NAST - Neural Architectures for Speech Technology
- NATAI - The Nature of Artificial Intelligence
- NMTBENCHMARK - Training and Benchmarking Neural MT and ASR Systems for Swiss Languages
- SIWIS - Spoken Interaction with Interpretation in Switzerland
- SP2 - SCOPES Project on Speech Prosody
- V-FAST - Vocal-tract based Fast Adaptation for Speech Technology