Speech and Audio Processing

Speech processing has been one of the mainstays of Idiap’s research portfolio for many years. Today the speech group is still the largest within the institute, and Idiap continues to be recognised as a leading contributor to the field. The expertise of the group encompasses statistical automatic speech recognition (based on hidden Markov models, or hybrid systems exploiting connectionist approaches), text-to-speech, and generic audio processing (covering sound source localization, microphone arrays, speaker diarization, audio indexing, very low bit-rate speech coding, and perceptual background noise analysis for telecommunication systems).

Current Group Members

  • BOURLARD, Hervé (Director, EPFL Full Professor)
  • GARNER, Philip (Senior Researcher)
  • MOTLICEK, Petr (Researcher)
  • MAGIMAI DOSS, Mathew (Researcher)
  • IMSENG, David (Research Associate)
  • LAZARIDIS, Alexandros (Research Associate)
  • SRINIVASAMURTHY, Ajay (Postdoctoral Researcher)
  • WANG, Yang (Postdoctoral Researcher)
  • ASAEI, Afsaneh (Postdoctoral Researcher)
  • VLASENKO, Bogdan (Postdoctoral Researcher)
  • HONNET, Pierre-Edouard (Postdoctoral Researcher)
  • MADIKERI, Srikanth (Postdoctoral Researcher)
  • RAM, Dhananjay (Research Assistant)
  • MUCKENHIRN, Hannah (Research Assistant)
  • HE, Weipeng (Research Assistant)
  • TONG, Sibo (Research Assistant)
  • DEY, Subhadeep (Research Assistant)
  • TORNAY, Sandrine (Research Assistant)
  • DIGHE, Pranay (Research Assistant)
  • SCHNELL, Bastian (Research Assistant)
  • RAZAVI, Marzieh (Research Assistant)
  • PRASAD, Amrutha (Trainee)

Alumni

  • AJMERA, Jitendra
  • ARADILLA ZAPATA, Guillermo
  • ATHINEOS, Marios
  • BAHAADINI, Sara
  • BARBER, David
  • BENZEGHIBA, Mohamed (Faouzi)
  • CEREKOVIC, Aleksandra
  • CEVHER, Volkan
  • CHAVARRIAGA, Ricardo
  • COLLADO, Thierry
  • CRITTIN, Frank
  • DINES, John
  • DRYGAJLO, Andrzej
  • DUFFNER, Stefan
  • GALAN MOLES, Ferran
  • GRANDVALET, Yves
  • GRANGIER, David
  • HAGEN, Astrid
  • HERMANSKY, Hynek
  • IKBAL, Shajith
  • IVANOVA, Maria
  • KETABDAR, Hamed
  • KRSTULOVIC, Sacha
  • LATHOUD, Guillaume
  • LI, Weifeng
  • MARIÉTHOZ, Johnny
  • MARTINS, Renato
  • MASSON, Olivier
  • MCCOWAN, Iain
  • MILLÁN, José del R.
  • MOORE, Darren
  • MORRIS, Andrew
  • MOSTAANI, Zohreh
  • MOULIN, François
  • NATUREL, Xavier
  • PARTHASARATHI, Sree Hari Krishnan
  • PINTO, Francisco
  • POTARD, Blaise
  • SHANKAR, Ravi
  • STEPHENSON, Todd
  • SZASZAK, György
  • TYAGI, Vivek
  • ULLMANN, Raphael
  • VALENTE, Fabio
  • WELLNER, Pierre

Group News

Idiap Speaker Series: 'Multilingual speech recognition in under-resourced environments' (webcast now available)
research — Jun 19, 2017

When speech processing systems are designed for use in multilingual environments, additional complexity is introduced. Identifying when language switching has occurred, predicting how cross-lingual terms will be pronounced, obtaining sufficient speech data from diverse language backgrounds: such factors all complicate the development of practical speech-oriented systems. In this talk, I will discuss our research group's experience in building speech recognition systems for the South African environment, one in which 11 official languages are recognised. I will also show how this relates to our participation in the BABEL project, a recent 5-year international collaborative project aimed at solving the spoken term detection task in under-resourced languages.

Idiap has a new opening for a postdoctoral position in automatic speech recognition
education — Nov 08, 2016

The Idiap Research Institute invites applications for a postdoctoral position in automatic speech recognition. The position is funded by a new industrial project with a leading credit card company in Switzerland. The research and development project will focus on combining speech recognition with speaker verification. The research will be carried out in collaboration with other projects (i.e. European H2020 projects) already running at the Idiap Research Institute.

Machines learn to speak Swiss-German
research — Feb 29, 2016

When it comes to voice-controlled devices, the Swiss German population has so far been left out in the cold. At best, smartphones, smart TVs and similar devices understand High German, but they cannot cope with the Swiss German dialect. This will change soon.

Research is evolving very rapidly.
institute — Jul 23, 2013

The surge of technological tools is accelerating communication and transforming our relationship with the world. Based in Martigny, Idiap, which carries out fundamental research projects at the highest level, works on improving human-machine interaction and on optimising human communication. This prestigious institute is committed to scientific progress in the service of humankind. Interview with its director, Hervé Bourlard, a world expert in speech processing and also a professor at the Ecole polytechnique fédérale de Lausanne (EPFL).

Multimodal Signal Processing: Human Interactions in Meetings
research — Jun 14, 2012

A new book on multimodal signal processing for the analysis of human communication was published by Cambridge University Press on June 7, 2012. The book was edited by Hervé Bourlard and Andrei Popescu-Belis, together with colleagues from the University of Edinburgh, and five other Idiap researchers contributed chapters to it.

New Book on Multimodal signal processing
research — Oct 29, 2009

Multimodal Signal Processing: Methods and Techniques to Build Multimodal Interactive Systems, by Jean-Philippe Thiran, Hervé Bourlard and Ferran Marques (authors). Academic Press Inc, November 23, 2009, 448 pages, ISBN-10: 0123748259.