Interpolation of Austrian German and Viennese Dialect/Sociolect in HMM-based Speech Synthesis

by Dietmar Schabus
Abstract:
In contrast to widely used waveform concatenation methods, the presented approach to speech synthesis relies on a parametric analysis–re-synthesis technique, where the features extracted in the analysis stage are modeled by hidden Markov models (HMMs). Many important improvements in the last decade have helped this approach to reach impressive performance. Additionally, its inherent flexibility makes it suitable for advanced speech synthesis tasks, like speaker adaptation, speaker interpolation, emotional speech, etc. In this work, a flexible multi-dialect HMM-based speech synthesis system for Austrian German and Viennese dialect/sociolect is presented. A novel contribution is the interpolation of dialects, where we have to deal with phonological processes that change the segmental structure of the utterance. Evaluation results show that listeners do perceive both continuous and categorical changes of varieties.
Reference:
Dietmar Schabus, “Interpolation of Austrian German and Viennese Dialect/Sociolect in HMM-based Speech Synthesis”, Master’s thesis, Vienna University of Technology, Vienna, Austria, 2009.
Bibtex Entry:
@MastersThesis{Schabus2009,
  Title                    = {Interpolation of Austrian German and Viennese Dialect/Sociolect in {HMM}-based Speech Synthesis},
  Author                   = {Schabus, Dietmar},
  School                   = {Vienna University of Technology},
  Year                     = {2009},
  Address                  = {Vienna, Austria},
  Month                    = apr,
  Abstract                 = {In contrast to widely used waveform concatenation methods, the presented approach to speech synthesis relies on a parametric analysis–re-synthesis technique, where the features extracted in the analysis stage are modeled by hidden Markov models (HMMs). Many important improvements in the last decade have helped this approach to reach impressive performance. Additionally, its inherent flexibility makes it suitable for advanced speech synthesis tasks, like speaker adaptation, speaker interpolation, emotional speech, etc. In this work, a flexible multi-dialect HMM-based speech synthesis system for Austrian German and Viennese dialect/sociolect is presented. A novel contribution is the interpolation of dialects, where we have to deal with phonological processes that change the segmental structure of the utterance. Evaluation results show that listeners do perceive both continuous and categorical changes of varieties.},
}