Rule-based stress assignment in a grapheme-to-phoneme system for Afrikaans (Reëlgebaseerde klemtoontoekenning in 'n grafeem-na-foneemstelsel vir Afrikaans)
Mouton, Elsie Wilhelmina
Text-to-speech systems are currently of great importance to the community. Stress assignment is a core technology in this human language technology resource and plays an important role in any text-to-speech system. At present, no automatic stress assigner exists for Afrikaans. For these reasons, the two most important aims of this project are: a) to develop a complete and accurate set of stress rules for Afrikaans that can be implemented in an automatic stress assigner, and b) to develop an effective and highly accurate stress assigner that assigns stress to Afrikaans words quickly. To reach the first aim, a set of stress rules for Afrikaans was developed. It consists of 18 rules, divided into groups for words that contain a schwa, derivations, and disyllabic, trisyllabic and polysyllabic simplex words. Next, different approaches to developing a stress assigner were examined, and the rule-based approach was chosen to implement the developed stress rules within the stress assigner. The programming language Perl was chosen for the implementation of the rules. The hyphenator, Calomo, and the compound analyser, CKarma, were used to hyphenate all the test data and to detect word boundaries within compounds. A dataset of 10 000 correctly annotated tokens was developed during the testing process. The evaluation of the stress assigner consists of four phases. During the first phase, the stress assigner was evaluated on the 10 000 tokens and achieved an accuracy of 92.09%. The grapheme-to-phoneme converter was evaluated on the same data and scored 91.9%. The influence of various factors on stress assignment was determined, and it was established that stress assignment is an essential component of rule-based grapheme-to-phoneme conversion.
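The rule-based approach and the accuracy figure described above can be illustrated with a minimal sketch. It is written in Python for illustration only; the project itself used Perl. Everything here is an assumption: the names `assign_stress`, `accuracy` and `RULES`, the first-match rule ordering, and the hyphen-separated input format. The two toy rules are hypothetical placeholders and are not among the 18 rules developed in the project.

```python
from typing import Callable, List, Optional, Sequence

# A rule inspects a syllabified word and returns the index of the stressed
# syllable, or None when the rule does not apply to this word.
Rule = Callable[[Sequence[str]], Optional[int]]


def toy_monosyllable_rule(syllables: Sequence[str]) -> Optional[int]:
    """HYPOTHETICAL placeholder: a one-syllable word stresses its only syllable."""
    return 0 if len(syllables) == 1 else None


def toy_default_rule(syllables: Sequence[str]) -> Optional[int]:
    """HYPOTHETICAL fallback: stress the first syllable."""
    return 0


# Assumed ordering: more specific rules first, the fallback last.
RULES: List[Rule] = [toy_monosyllable_rule, toy_default_rule]


def assign_stress(hyphenated: str, rules: Sequence[Rule] = RULES) -> int:
    """Return the stressed-syllable index for a hyphenated word such as 'ta-fel'."""
    syllables = hyphenated.split("-")
    for rule in rules:
        index = rule(syllables)
        if index is not None:
            return index  # the first applicable rule decides
    raise ValueError(f"no rule applied to {hyphenated!r}")


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Percentage of tokens whose predicted stress matches the annotation."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equally long")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)
```

If the reported accuracy is counted per token, as assumed in this sketch, 92.09% of 10 000 tokens corresponds to 9 209 correctly stressed tokens.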
In conclusion, the stress assigner achieved satisfactory results and can be used in future projects to develop training data for further experiments with stress assignment and grapheme-to-phoneme conversion for Afrikaans. Future experiments with data-driven approaches may lead to better results in Afrikaans stress assignment and grapheme-to-phoneme conversion.