Geneemo: An Expressive Audio Content Generation Tool

The doctoral dissertation of Lakshmi Saheer focussed on rapid speaker adaptation in a speech-to-speech translation system using vocal tract length normalization (VTLN). This was the first time in the history of speech synthesis that a feature transformation was used to alter output speaker characteristics. Along similar lines, it was later shown that changing the speech excitation and source parameters also modifies the emotions and expressions in the speech. Based on these results, we propose a startup (Geneemo) built on technology for adding emotions, expressions, special effects and characterization to any speech. This should add value to any voice-based application on the market, ranging from audio books, gaming applications and GPS systems to speech-generating devices used by people with speech impairments. Based on a short market survey, we found that the audio book market is the best fit for this technology, since current audio books are narrated by highly paid actors. Because of this cost, only the best sellers, or the top 10% of book titles, are converted to audio books. The Geneemo audio book would reduce the cost of audio book generation by 60% whilst producing audio that sounds like a movie soundtrack, with emotions, expressions, multiple characters and background scores.
Application Area - Entertainment and games, Information Interfaces and Presentation
Idiap Research Institute
Hasler Stiftung (Hasler Foundation)
Apr 01, 2014
Sep 30, 2015