Master’s Thesis

For my master’s thesis at FTW and TU Wien, I worked on hidden Markov model (HMM) based speech synthesis. I finished it, and with it my computer science master’s degree, in 2009. The title of the thesis is:

Interpolation of Austrian German and Viennese Dialect/Sociolect in HMM-based Speech Synthesis

Here’s its abstract:

In contrast to widely used waveform concatenation methods, the presented approach to speech synthesis relies on a parametric analysis–re-synthesis technique, where the features extracted in the analysis stage are modeled by hidden Markov models (HMMs). Many important improvements in the last decade have helped this approach to reach impressive performance. Additionally, its inherent flexibility makes it suitable for advanced speech synthesis tasks, like speaker adaptation, speaker interpolation, emotional speech, etc. In this work, a flexible multi-dialect HMM-based speech synthesis system for Austrian German and Viennese dialect/sociolect is presented. A novel contribution is the interpolation of dialects, where we have to deal with phonological processes that change the segmental structure of the utterance. Evaluation results show that listeners do perceive both continuous and categorical changes of varieties.

Download the full text PDF here: schabus_masterthesis_final.pdf (5.4 MB). The PDF is also available from

The Austrian Computer Society (OCG) gives out an award for the best computer science thesis every year. I was lucky enough to win this award.

My work on this project also contributed to a journal paper in Speech Communication:

Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Friedrich Neubarth and Volker Strom
Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis
Speech Communication, Volume 52, Issue 2, February 2010, Pages 164-179
The paper is available on ScienceDirect.

Dr. Yamagishi also posted a news entry on his website about the award I won.

You can listen to some interpolation examples here (link now defunct).