Audio Inference

Bayesian inference and learning applied to audio, speech and language.

Vision

We work in the general area of audio processing, with a particular focus on audio signals that contain speech. Our expertise centres on building models of production and perception that allow machines to mimic human capabilities. Our models connect the audio signal with higher-level semantics such as speech, language, affect and intent.

We embrace rigorous mathematical and Bayesian techniques to infer model parameters as well as the semantics required by the application. More speculatively, we also use such techniques to allow our models to make inferences about the human production, perception and cognitive systems on which they are based.

The group title abbreviates to AI, emphasising a deep connection to the wider Artificial Intelligence field of our host institute. Through our audio and speech focus we contribute to Idiap’s core Human-AI Teaming program; through the inference of biological function we also aim to contribute to the AI for Life program. In each case we benefit from colleagues with complementary skills, and hope to assist them in their own endeavours.