Implementing a distributed approach for speech resource and system development

Molapo, Nkadimeng Raymond

dc.contributor.author	Molapo, Nkadimeng Raymond
dc.date.accessioned	2016-01-19T09:59:37Z
dc.date.available	2016-01-19T09:59:37Z
dc.date.issued	2014
dc.identifier.uri	http://hdl.handle.net/10394/15922
dc.description	MIng (Computer and Electronic Engineering), North-West University, Potchefstroom Campus, 2014	en_US
dc.description.abstract	The range of applications for high-quality automatic speech recognition (ASR) systems has grown dramatically with the advent of smart phones, in which speech recognition can greatly enhance the user experience. Currently, the languages with extensive ASR support on these devices are those for which thousands of hours of transcribed speech corpora have already been collected. Developing a speech system for such a language is simpler because extensive resources already exist. For less prominent languages, however, the process is more difficult. Obstacles such as reliability and cost have hampered progress in this regard, and separate tools have been developed for each stage of the development process to overcome these difficulties. Developing a system that combines these partial solutions involves customising existing tools and developing new ones to interface the overall end-to-end process. This work documents the integration of several tools to enable the end-to-end development of an ASR system in a typical under-resourced language. Google App Engine is employed as the core environment for data verification, storage and distribution, and is used in conjunction with existing tools for gathering text data and for recording speech data. We analyse the data acquired by each of the tools and develop an ASR system in Shona, an important under-resourced language of Southern Africa. Although unexpected logistical problems complicated the process, we were able to collect a usable Shona speech corpus and develop the first ASR system in that language.	en_US
dc.language.iso	en	en_US
dc.subject	Automatic speech recognition	en_US
dc.subject	Smart phones	en_US
dc.subject	Transcribed speech corpora	en_US
dc.subject	Under-resourced language	en_US
dc.subject	Google App Engine	en_US
dc.subject	Data verification	en_US
dc.subject	Shona	en_US
dc.title	Implementing a distributed approach for speech resource and system development	en
dc.type	Thesis	en_US
dc.description.thesistype	Masters	en_US

Files in this item

Name: Molapo_NR_2014.pdf
Size: 888.7Kb
Format: PDF

This item appears in the following Collection(s)

Engineering [1403]