Learning to transcribe Swiss German dialects

Improving Swiss German speech recognition for Swisscom TV Box Voice Assistant

In Switzerland, Standard German (the variety spoken across Germany) stands in a diglossic relationship with Swiss German dialects, the varieties of everyday communication throughout the German-speaking cantons. Swiss Standard German (referred to as Schweizer Schriftdeutsch) is a variety of Standard German and the written form of the official German used in Switzerland. It is used in books, newspapers, and all official publications; however, it is not typically spoken. Writing in Swiss German has arisen only rather recently (notably in text messaging) and, as a result, there are no orthographic conventions for Swiss German varieties.

Swiss Standard German differs from Standard German at all levels of linguistic analysis, including vocabulary, pronunciation, orthography, and even syntax; these distinctive features are called Helvetisms. Differences at these levels also exist among the various Swiss German dialects. Fortunately, the Swiss German dialect area is the best researched in Central Europe. The Dieth spelling system (a system of phonetic transcription) is used in most scientific accounts for writing Swiss German dialects (referred to as GSW). It uses Standard German spelling as a starting point but deviates where that spelling is inconsistent or lacks the precision needed to describe the various Swiss dialects. Dialectal variation causes lexical units to be pronounced, and therefore written, differently in different regions. For instance, the Standard German ‘gehst mir bitte auf das wetter’ (‘Tell me the weather please’) can be written in Swiss German as ‘geisch mer bitte uf ds wätter’ or ‘gaischt mer bitte uff daas wetter’ in different regions, where geisch/gaischt, uf/uff, ds/daas, and wätter are variants of the Standard German words gehst, auf, das, and wetter, respectively. To establish the lexical identity of all written variants, they need to be normalized to a single form.
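The normalization step can be pictured as a lookup from each written variant to one shared form. The following is a minimal sketch, not the system described here: the table `VARIANT_TO_NORMALIZED` and the function `normalize` are hypothetical names, the entries come only from the example sentence above, and the pairing of ‘mer’ with ‘mir’ is inferred from the word alignment of that example.

```python
# Hypothetical variant table built only from the example sentence in the text.
# A real dialectal lexicon would be far larger and would not be hand-written.
VARIANT_TO_NORMALIZED = {
    "geisch": "gehst",
    "gaischt": "gehst",
    "mer": "mir",      # assumption: inferred from the aligned example
    "uf": "auf",
    "uff": "auf",
    "ds": "das",
    "daas": "das",
    "wätter": "wetter",
}


def normalize(utterance: str) -> str:
    """Map each dialectal token to its normalized form.

    Tokens that are not in the table are passed through unchanged.
    """
    tokens = utterance.lower().split()
    return " ".join(VARIANT_TO_NORMALIZED.get(token, token) for token in tokens)


if __name__ == "__main__":
    for text in ("geisch mer bitte uf ds wätter",
                 "gaischt mer bitte uff daas wetter"):
        print(text, "->", normalize(text))
```

Both regional spellings end up as the same token sequence, which is what allows a single lexical identity to be assigned to them. A plain dictionary lookup cannot handle variants it has never seen, which is the gap that automatic candidate generation is meant to close.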

Automatic speech recognition (ASR) of Swiss German is a considerable challenge owing to the lack of transcribed datasets and the extensive regional variation described above. We at Idiap, in collaboration with the Swisscom AI group, designed a multi-dialectal approach to word generation for Swiss German that handles the linguistic variation of words across dialect landscapes. Given a Swiss German word form and a set of handcrafted written variants, we learn a model that automatically generates inflected word candidates for unseen words, or alternative forms for known words, in different Swiss German dialects. This helps us build a large dialectal lexicon that establishes a shared identity across word variants. Together with an acoustic model and a language model, it allows us to provide a normalized transcription of Swiss German. Our experiments on Swiss German multi-dialect data extracted from the Swisscom TV Box voice assistant indicate a significant improvement in ASR performance, especially for dialects with little or no linguistic information.
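To make the idea of candidate generation concrete, here is a toy sketch that expands one word form into alternative spellings and registers them in a lexicon under a shared normalized form. It is a rule-based stand-in, not the learned model described above: the table `ALTERNATIVES` and the functions `generate_candidates` and `extend_lexicon` are hypothetical, and the substitution pairs are loosely taken from the variant pairs geisch/gaischt, uf/uff, and wätter/wetter.

```python
from itertools import product

# Hypothetical spelling alternations; a learned model would replace this table.
ALTERNATIVES = {
    "ei": ["ei", "ai"],
    "ä": ["ä", "e"],
    "ff": ["ff", "f"],
}


def generate_candidates(word: str) -> set:
    """Return every spelling obtained by applying the alternations to `word`."""
    segments = []
    i = 0
    while i < len(word):
        for pattern, options in ALTERNATIVES.items():
            if word.startswith(pattern, i):
                segments.append(options)   # this segment may be spelled several ways
                i += len(pattern)
                break
        else:
            segments.append([word[i]])     # no rule applies: keep the character
            i += 1
    return {"".join(parts) for parts in product(*segments)}


def extend_lexicon(lexicon: dict, word: str, normalized: str) -> None:
    """Add all generated candidates for `word`, keeping existing entries."""
    for candidate in generate_candidates(word):
        lexicon.setdefault(candidate, normalized)


if __name__ == "__main__":
    lexicon = {}
    extend_lexicon(lexicon, "wätter", "wetter")
    extend_lexicon(lexicon, "uff", "auf")
    print(sorted(lexicon.items()))
```

Exhaustive expansion like this over-generates quickly as rules are added, which is one reason a learned model that scores or filters candidates is the more plausible choice in practice.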