Woefzela - an open-source platform for ASR data collection in the developing world
Date
2011
Author
de Vries, Nic J.
Badenhorst, Jaco
Davel, Marelie H.
Barnard, Etienne
de Waal, Alta
Abstract
Building transcribed speech corpora for under-resourced
languages plays a pivotal role in developing speech technologies
for such languages. We have developed an open-source
tool for devices running the Android operating system to facilitate
the efficient collection of speech data for Automatic Speech
Recognition system development. The tool was designed for
use in typical developing-world conditions; we present the relevant
design choices and analyse the effectiveness of this tool
by means of a case study. In particular, we introduce a novel
semi-real-time quality monitoring system, which increases the
efficiency of the data collection process.
URI
https://researchspace.csir.co.za/dspace/bitstream/handle/10204/5149/de%20Vries_2011.pdf?sequence=1&isAllowed=y
https://pdfs.semanticscholar.org/0c4c/bfd1ac75240666a2c40e97f3e171906aebdb.pdf
http://hdl.handle.net/10394/26542
Collections
- Faculty of Engineering [1123]