Effective Multilingual Interaction in Mobile Environments
The EMIME project will help to overcome the language barrier by developing a mobile device that performs personalised speech-to-speech translation: the user's spoken input in one language is used to produce spoken output in another language that still sounds like the user's own voice.
Personalisation of systems for cross-lingual spoken communication is an important, but little explored, topic. It is essential for providing more natural interaction and making the computing device a less obtrusive element when assisting human-human interactions.
We will build on recent developments in speech synthesis using hidden Markov models (HMMs), the same technology that is used for automatic speech recognition. Using a common statistical modelling framework for automatic speech recognition and speech synthesis will enable the use of common techniques for adaptation and multilinguality.
Significant progress will be made towards a unified approach for speech recognition and speech synthesis: this is a very powerful concept, and will open up many new areas of research. In this project, we will explore the use of speaker adaptation across languages so that, by performing automatic speech recognition, we can learn the characteristics of an individual speaker, and then use those characteristics when producing output speech in another language.
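The idea of learning a speaker's characteristics during recognition and reusing them for synthesis in another language can be illustrated with a deliberately simplified sketch. It is not the project's method: it assumes that the speaker is described by a single global bias vector on the model means, that the recognition and synthesis models share one acoustic feature space, and that each input frame has already been aligned to a model state. All function names, state labels and numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def estimate_global_bias(
    aligned_frames: Sequence[Tuple[str, Vector]],
    recognition_means: Dict[str, Vector],
) -> Vector:
    """Average offset between the speaker's frames and the means of the
    states they were aligned to (toy stand-in for speaker adaptation)."""
    if not aligned_frames:
        raise ValueError("need at least one aligned frame")
    dim = len(aligned_frames[0][1])
    total = [0.0] * dim
    for state, frame in aligned_frames:
        mean = recognition_means[state]
        for d in range(dim):
            total[d] += frame[d] - mean[d]
    return [t / len(aligned_frames) for t in total]


def apply_bias(means: Dict[str, Vector], bias: Vector) -> Dict[str, Vector]:
    """Shift every state mean by the same speaker-specific bias."""
    return {s: [m + b for m, b in zip(mean, bias)] for s, mean in means.items()}


# Hypothetical input-language recognition model (state label -> mean vector).
recognition_means = {"a": [1.0, 2.0], "b": [3.0, 4.0]}

# Hypothetical frames from the user, each paired with its aligned state.
aligned_frames = [
    ("a", [1.5, 1.0]),
    ("b", [3.5, 3.0]),
    ("a", [1.5, 1.0]),
    ("b", [3.5, 3.0]),
]

# Hypothetical output-language synthesis model in the same feature space.
synthesis_means = {"x": [0.0, 0.0], "y": [2.0, 5.0]}

speaker_bias = estimate_global_bias(aligned_frames, recognition_means)
adapted_synthesis_means = apply_bias(synthesis_means, speaker_bias)
```

The point of the sketch is that the speaker description (`speaker_bias`) is estimated only from input-language data but is applied to the output-language model, which is possible only because both models live in the same statistical framework. Practical HMM adaptation typically uses richer transforms than a single bias vector.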
