Interpolation of Austrian German and Viennese Dialect/Sociolect in HMM-based Speech Synthesis
Here’s its abstract:
In contrast to widely used waveform concatenation methods, the presented approach to speech synthesis relies on a parametric analysis–re-synthesis technique, where the features extracted in the analysis stage are modeled by hidden Markov models (HMMs). Many important improvements in the last decade have helped this approach to reach impressive performance. Additionally, its inherent flexibility makes it suitable for advanced speech synthesis tasks, like speaker adaptation, speaker interpolation, emotional speech, etc. In this work, a flexible multi-dialect HMM-based speech synthesis system for Austrian German and Viennese dialect/sociolect is presented. A novel contribution is the interpolation of dialects, where we have to deal with phonological processes that change the segmental structure of the utterance. Evaluation results show that listeners do perceive both continuous and categorical changes of varieties.
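The core mechanism behind such interpolation can be illustrated with a toy sketch: the output distributions of two variety-specific HMM voice models are blended by linearly interpolating their Gaussian mean vectors with a weight between 0 and 1. This is only a hypothetical illustration of the general idea, not the system from the thesis (which interpolates full model parameter sets and additionally handles phonological processes that change the segmental structure); all names and numbers below are made up.

```python
# Toy sketch of variety interpolation in HMM-based synthesis (illustrative
# only): blend the per-state Gaussian mean vectors of two voice models with
# an interpolation weight alpha in [0, 1].

def interpolate_gaussians(means_a, means_b, alpha):
    """Linearly interpolate per-state mean vectors of two model sets."""
    return [
        [(1 - alpha) * ma + alpha * mb for ma, mb in zip(state_a, state_b)]
        for state_a, state_b in zip(means_a, means_b)
    ]

# Made-up "Austrian German" and "Viennese" model means (2 states, 2 dims)
austrian = [[1.0, 2.0], [3.0, 4.0]]
viennese = [[2.0, 0.0], [5.0, 2.0]]

# alpha = 0 reproduces the first model, alpha = 1 the second; intermediate
# values yield blended "in-between" varieties.
halfway = interpolate_gaussians(austrian, viennese, 0.5)
print(halfway)  # [[1.5, 1.0], [4.0, 3.0]]
```

In a real system the interpolation would also cover variances and duration models, and the dialect case is harder precisely because the two varieties may not share the same phone sequence for a given utterance.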
The Austrian Computer Society (OCG) presents an annual award for the best computer science thesis. I was lucky enough to win this award.
- OCG press release (in German) (link now defunct)
- OCG Journal entry (in German) (link now defunct)
- FTW news entry (link now defunct)
- Faculty of Informatics at TU Wien news entry (in German)
My work on this project also contributed to a journal paper in Speech Communication:
Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Friedrich Neubarth and Volker Strom
Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis
Speech Communication, Volume 52, Issue 2, February 2010, Pages 164-179
This paper on ScienceDirect
Dr. Yamagishi also posted a news entry on his website about the award I won.
You can listen to some interpolation examples here. (link now defunct)