dc.contributor.author | de Vries, Nic J. | |
dc.contributor.author | Badenhorst, Jaco | |
dc.contributor.author | Davel, Marelie H. | |
dc.contributor.author | Barnard, Etienne | |
dc.contributor.author | de Waal, Alta | |
dc.date.accessioned | 2018-03-07T07:47:27Z | |
dc.date.available | 2018-03-07T07:47:27Z | |
dc.date.issued | 2011 | |
dc.identifier.citation | Nic J. de Vries, Jaco Badenhorst, Marelie H. Davel, Etienne Barnard and Alta de Waal, “Woefzela - an open-source platform for ASR data collection in the developing world”, in Proc. Interspeech, pp. 3177-3180, Florence, Italy, 2011. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications] | en_US |
dc.identifier.uri | https://researchspace.csir.co.za/dspace/bitstream/handle/10204/5149/de%20Vries_2011.pdf?sequence=1&isAllowed=y | |
dc.identifier.uri | https://pdfs.semanticscholar.org/0c4c/bfd1ac75240666a2c40e97f3e171906aebdb.pdf | |
dc.identifier.uri | http://hdl.handle.net/10394/26542 | |
dc.description.abstract | Building transcribed speech corpora for under-resourced languages plays a pivotal role in developing speech technologies for such languages. We have developed an open-source tool for devices running the Android operating system to facilitate the efficient collection of speech data for Automatic Speech Recognition system development. The tool was designed for use in typical developing-world conditions; we present the relevant design choices and analyse the effectiveness of this tool by means of a case study. In particular, we introduce a novel semi-real-time quality monitoring system, which increases the efficiency of the data collection process. | en_US |
dc.description.sponsorship | This project was made possible through the support of the South African National Centre for Human Language Technology, an initiative of the South African Department of Arts and Culture. The authors would also like to thank Pedro Moreno, Thad Hughes and Ravindran Rajakumar of Google Research for valuable inputs at various stages of this work. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Interspeech 2011 | en_US |
dc.subject | Speech resource collection | en_US |
dc.subject | Automatic speech recognition | en_US |
dc.subject | Developing world | en_US |
dc.subject | Resource-scarce environment | en_US |
dc.subject | Under-resourced languages | en_US |
dc.subject | Android | en_US |
dc.title | Woefzela - an open-source platform for ASR data collection in the developing world | en_US |
dc.type | Presentation | en_US |