    • Collecting and evaluating speech recognition corpora for 11 South African languages 

      Badenhorst, Jaco; Van Heerden, Charl; Barnard, Etienne; Davel, Marelie H. (Springer, 2011)
      We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which contains data from the eleven official languages of South Africa. Because of practical constraints, the amount of ...

    • The NCHLT Speech Corpus of the South African languages

      Barnard, Etienne; Davel, Marelie H.; van Heerden, Charl; De Wet, Febe; Badenhorst, Jaco (Workshop Spoken Language Technologies for Under-resourced Languages (SLTU), 2014)
      The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were ...