    • Collecting and evaluating speech recognition corpora for 11 South African languages 

      Badenhorst, Jaco; Van Heerden, Charl; Barnard, Etienne; Davel, Marelie H. (Springer, 2011)
      We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which contains data from the eleven official languages of South Africa. Because of practical constraints, the amount of ...
    • Improving the Lwazi ASR baseline 

      van Heerden, Charl; Kleynhans, Neil; Davel, Marelie H. (Interspeech 2016, 2016)
      We investigate the impact of recent advances in speech recognition techniques for under-resourced languages. Specifically, we review earlier results published on the Lwazi ASR corpus of South African languages, and ...
    • The NCHLT Speech Corpus of the South African languages 

      Barnard, Etienne; Davel, Marelie H.; van Heerden, Charl; De Wet, Febe; Badenhorst, Jaco (Workshop Spoken Language Technologies for Under-resourced Languages (SLTU), 2014)
      The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were ...
    • The predictability of name pronunciation errors in four South African languages 

      Kgampe, Mpho; Davel, Marelie H. (Pattern Recognition Association of South Africa and Mechatronics International Conference, 2011)
      Personal names are often pronounced in very different ways depending on the language background of the speaker. We seek to determine whether some of these pronunciation 'errors' are systematic and if so, in which ways. ...