Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis

by Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Friedrich Neubarth, Volker Strom
Abstract:
An HMM-based speech synthesis framework is applied to both standard Austrian German and a Viennese dialectal variety, and several training strategies for multi-dialect modeling, such as dialect clustering and dialect-adaptive training, are investigated. To bridge the gap between processing at the level of HMMs and at the linguistic level, we add phonological transformations to the HMM interpolation and apply them to dialect interpolation. The crucial step is to employ several formalized phonological rules between Austrian German and Viennese dialect as constraints for the HMM interpolation. We verify the effectiveness of this strategy in a number of perceptual evaluations. Since the HMM space used is an acoustic rather than an articulatory space, the evaluation results vary somewhat between the phonological rules. In general, however, we obtained good evaluation results, which show that listeners can perceive both continuous and categorical changes of dialect varieties when phonological transformations are employed as switching rules in the HMM interpolation.
Reference:
Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Friedrich Neubarth, Volker Strom, “Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis”, In Speech Communication, vol. 52, no. 2, pp. 164-179, 2010.
Bibtex Entry:
@Article{Pucher2010,
  Title                    = {Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis},
  Author                   = {Pucher, Michael and Schabus, Dietmar and Yamagishi, Junichi and Neubarth, Friedrich and Strom, Volker},
  Journal                  = {Speech Communication},
  Year                     = {2010},

  Month                    = feb,
  Number                   = {2},
  Pages                    = {164--179},
  Volume                   = {52},

  Abstract                 = {An HMM-based speech synthesis framework is applied to both standard Austrian German and a Viennese dialectal variety, and several training strategies for multi-dialect modeling, such as dialect clustering and dialect-adaptive training, are investigated. To bridge the gap between processing at the level of HMMs and at the linguistic level, we add phonological transformations to the HMM interpolation and apply them to dialect interpolation. The crucial step is to employ several formalized phonological rules between Austrian German and Viennese dialect as constraints for the HMM interpolation. We verify the effectiveness of this strategy in a number of perceptual evaluations. Since the HMM space used is an acoustic rather than an articulatory space, the evaluation results vary somewhat between the phonological rules. In general, however, we obtained good evaluation results, which show that listeners can perceive both continuous and categorical changes of dialect varieties when phonological transformations are employed as switching rules in the HMM interpolation.},
  Doi                      = {10.1016/j.specom.2009.09.004},
  ISSN                     = {0167-6393},
  Keywords                 = {austrian german,dialect,hidden markov model,sociolect,speech synthesis},
}