dc.contributor.author: Van Heerden, Carel J.
dc.contributor.author: Barnard, Etienne
dc.contributor.author: Davel, Marelie H.
dc.identifier.citation: Davel, M.H. & Van Heerden, C.J., et al. 2012. Validating smartphone-collected speech corpora. In: International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, 7-9 May 2012. [en_US]
dc.description: International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, 7-9 May 2012 [en_US]
dc.description.abstract: We investigate the effectiveness with which the accuracy of a prompted speech corpus can be validated when minimal additional speech resources are available, and specifically when a language model in the target language is not available. We compare a word-based variant of Goodness of Pronunciation (GOP) with a phone-based dynamic programming (PDP) scoring technique. The first technique uses the acoustic likelihood ratio and the second the optimal alignment between an observed phone string (generated by a speech recogniser) and a reference phone string (obtained from a dictionary) to generate validation scores. We define a new technique to obtain a PDP scoring matrix in a data-driven fashion, examine different ways of using GOP for word scoring, and find that variants of both techniques provide results that are effective for corpus validation. [en_US]
dc.subject: Speech corpora [en_US]
dc.subject: Corpus validation [en_US]
dc.subject: Goodness of pronunciation [en_US]
dc.subject: Phone-based dynamic programming scores [en_US]
dc.title: Validating smartphone-collected speech corpora [en_US]
dc.contributor.researchID: 23607955 - Davel, Marelie Hattingh
dc.contributor.researchID: 11539151 - Van Heerden, Carel Jacobus
dc.contributor.researchID: 21021287 - Barnard, Etienne

