Part-of-speech effects on text-to-speech synthesis
Date
2010
Author
Schlunz, Georg I.
Barnard, Etienne
van Huyssteen, Gerhard B.
Abstract
One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesized speech. Towards this end, various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental NLP tasks being used is the part-of-speech (POS) tagging of the words in the text. This paper investigates the effects of POS information on the naturalness of a hidden Markov model (HMM) based TTS voice when additional resources are not available to aid in the modeling of prosody. It is found that, when a minimal feature set is used for the HMM context labels, the addition of POS tags does improve the naturalness of the voice. However, the same effect can be accomplished by including segmental counting and positional information instead of the POS tags.
URI
https://researchspace.csir.co.za/dspace/bitstream/handle/10204/4674/Schlunz_2010.pdf?sequence=1&isAllowed=y
http://hdl.handle.net/10394/26555
Collections
- Faculty of Engineering [1123]
Related items
Showing items related by title, author, creator and subject.
- Automatic speech segmentation with limited data
  Van Niekerk, Daniel Rudolph (North-West University, 2009)
  The rapid development of corpus-based speech systems such as concatenative synthesis systems for under-resourced languages requires an efficient, consistent and accurate solution with regard to phonetic speech segmentation. ...
- A smartphone-based ASR data collection tool for under-resourced languages
  De Vries, Nic J.; Badenhorst, Jaco; Basson, Willem D.; De Wet, Febe; Barnard, Etienne; De Waal, Alta; Davel, Marelie H. (Elsevier, 2014)
  Acoustic data collection for automatic speech recognition (ASR) purposes is a particularly challenging task when working with under-resourced languages, many of which are found in the developing world. We provide a brief ...
- Effective automatic speech recognition data collection for under–resourced languages
  De Vries, Nicolaas Johannes (North-West University, 2011)
  As building transcribed speech corpora for under–resourced languages plays a pivotal role in developing automatic speech recognition (ASR) technologies for such languages, a key step in developing these technologies is the ...