dc.contributor.author | Van Heerden, Charl J. | |
dc.contributor.author | Barnard, Etienne | |
dc.contributor.author | Davel, Marelie H. | |
dc.date.accessioned | 2015-03-31T06:58:10Z | |
dc.date.available | 2015-03-31T06:58:10Z | |
dc.date.issued | 2012 | |
dc.identifier.citation | Van Heerden, C.J. & Davel, M.H., et al. 2012. Medium-vocabulary speech recognition for under-resourced languages. In: International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, 7-9 May 2012. [http://www.mica.edu.vn/sltu2012/files/proceedings/26.pdf] | en_US |
dc.identifier.isbn | 978-1-86822-615-3 | |
dc.identifier.uri | http://hdl.handle.net/10394/13632 | |
dc.identifier.uri | http://www.mica.edu.vn/sltu2012/files/proceedings/26.pdf | |
dc.description | International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, 7-9 May 2012 | en_US |
dc.description.abstract | We report on the development of speech-recognition systems that are able to perform accurate recognition on medium-vocabulary tasks (i.e. tasks that require distinctions between approximately 200 different terms). We are able to achieve error rates of less than 5% (our design goal) on four under-resourced languages as well as English, by using training corpora that contain 70–100 hours of speech per language. The majority of the errors stem from words such as abbreviations, foreign words or names, which do not adhere to the standard orthography of the target language. We also find that recognition accuracy does not depend strongly on the number of occurrences of a term in the training set or the length of the term to be recognized, and that a few problematic speakers are responsible for a disproportionate number of errors. | en_US |
dc.language.iso | en | en_US |
dc.publisher | SLTU | en_US |
dc.subject | Speech recognition | en_US |
dc.subject | Under-resourced languages | en_US |
dc.subject | Multilingual speech processing | en_US |
dc.title | Medium-vocabulary speech recognition for under-resourced languages | en_US |
dc.type | Other | en_US |
dc.contributor.researchID | 11539151 - Van Heerden, Carel Jacobus | |
dc.contributor.researchID | 23607955 - Davel, Marelie Hattingh | |
dc.contributor.researchID | 21021287 - Barnard, Etienne | |