Research Group

Speech & Audio Processing Group

Head: Prof. Hervé Bourlard

Speech processing has been one of the mainstays of Idiap’s research portfolio for many years. Today the group is still the largest in the institute, and Idiap continues to be recognized as a leader in the field. The expertise of the group encompasses statistical automatic speech recognition (based on hidden Markov models or hybrid systems exploiting connectionist approaches), text-to-speech, and generic audio processing (covering sound source localization, microphone arrays, speaker diarization, audio indexing, very-low-bit-rate speech coding, and perceptual background noise analysis for telecommunication systems).

Social Computing Group

Head: Prof. Daniel Gatica-Perez

Social computing is an interdisciplinary domain that integrates theories and models from mobile and ubiquitous computing, multimedia, machine learning, and social sciences in order to sense, analyze, and interpret human and social behavior in daily life, and to create devices and systems that support interaction and communication. Current lines of research include ubiquitous sensing of face-to-face interaction, behavioral analysis of social video, crowdsourcing, and urban data mining using smartphones and mobile social networks.


Machine Learning Group

Head: Dr. François Fleuret

The goal of our group is the development of new machine learning techniques, with a particular interest in their computational properties. Our application domain is mainly computer vision and includes object detection, scene analysis, tracking of persons and biological structures, and image recognition in general.

Perception & Activity Understanding Group

Head: Dr. Jean-Marc Odobez

This group conducts research into human–human activity analysis using multimodal data. This entails the investigation of fundamental tasks such as the representation, detection, segmentation, and tracking of objects and people; the characterization of their state; and the modeling of sequential data and its interpretation in the form of gestures, activities, behavior, or social relationships. These investigations take place through the design of principled algorithms that extend models from computer vision, statistical learning, or multimodal signal processing. Surveillance, traffic analysis, analysis of behavior, human–robot interfaces, and multimedia content analysis are the main application domains.


Robot Learning & Interaction Group

Head: Dr. Sylvain Calinon

The Robot Learning & Interaction group focuses on human-centric robot applications. The scientific objective is to develop probabilistic approaches for encoding movements and behaviors in robots operating in unconstrained environments. In these applications, the models serve several purposes (recognition, prediction, online synthesis) and are shared by different learning strategies (imitation, emulation, incremental refinement, or exploration). The aim is to facilitate the transfer of skills from end users to robots, or between robots, by exploiting multimodal sensory information and by developing intuitive teaching interfaces.

Uncertainty Quantification and Optimal Design Group

Head: Dr. David Ginsbourger

The Uncertainty Quantification and Optimal Design group focuses on quantifying and reducing uncertainties in the context of high-fidelity models, with core expertise in Gaussian process methods and the sequential design of computer experiments for optimization, inversion, and related problems. Application domains notably include energy and geosciences, with current collaborations ranging from safety engineering to hydrology and climate sciences.


Natural Language Understanding Group

Head: Dr. James Henderson

The Idiap NLU group was created in September 2017 under the direction of James Henderson, in part as a continuation of the previous Natural Language Processing group, which was led by Andrei Popescu-Belis. The NLU group continues work on how semantic and discourse processing of text and dialog can improve statistical machine translation and information indexing, with a recent focus on neural machine translation and attention-based deep learning models. This fits well with the NLU group's new research direction of neural network structured prediction and representation learning for modeling the syntax and semantics of text and speech, including modeling abstraction (textual entailment) and summarization.

Computational Bioimaging Group

Head: Dr. Michael Liebling

This group focuses on research into computational imaging and the analysis of biomedical images. This includes developing algorithms for image deconvolution and super-resolution in optical microscopy, three-dimensional tomography reconstruction from projections, and, more generally, combining unusual sensing devices and approaches with computational methods to produce images ideally suited to the observation and quantification of complex and live biological systems.


Biometrics Security and Privacy Group

Head: Dr. Sébastien Marcel

In computer science, biometrics refers to the automatic recognition of individuals based on their behavioral and biological characteristics. The group investigates and develops novel image-processing and pattern-recognition algorithms for face recognition (2-D, 3-D, and near-infrared), speaker recognition, anti-spoofing (attack detection), and emerging biometric modes (EEG and veins). The group is geared toward reproducible research and technology transfer, using its own signal-processing and machine-learning toolbox.