    • Automatic alignment of audiobooks in Afrikaans 

      Van Heerden, Carel J.; De Wet, Febe; Davel, Marelie H. (Pattern Recognition Association of South Africa (PRASA), 2012)
      This paper reports on the automatic alignment of audiobooks in Afrikaans. An existing Afrikaans pronunciation dictionary and corpus of Afrikaans speech data are used to generate baseline acoustic models. The baseline system ...
    • Collecting and evaluating speech recognition corpora for 11 South African languages 

      Badenhorst, Jaco; Van Heerden, Charl; Barnard, Etienne; Davel, Marelie H. (Springer, 2011)
      We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which contains data from the eleven official languages of South Africa. Because of practical constraints, the amount of ...
    • Medium-vocabulary speech recognition for under-resourced languages 

      Van Heerden, Charl J.; Barnard, Etienne; Davel, Marelie H. (SLTU, 2012)
      We report on the development of speech-recognition systems that are able to perform accurate recognition on medium-vocabulary tasks (i.e. tasks that require distinctions between approximately 200 different terms). We are ...
    • The semi-automated creation of stratified speech corpora 

      Van Heerden, Carel; Barnard, Etienne; Davel, Marelie H. (Pattern Recognition Association of South Africa (PRASA), 2013)
      Smartphones provide an efficient means for the collection of speech data; however, the quality of the corpora created in this fashion is not predictable. We describe an approach that allows us to post-process and rank ...
    • Semi-supervised training for lecture transcription in resource-scarce environments 

      De Villiers, Pieter; Barnard, Etienne; Van Heerden, Charl (PRASA, 2014)
      We present a study where standard semi-supervised training methods are applied in a resource-scarce environment to build lecture transcription systems. Experiments are conducted on two different corpora which one can expect ...
    • Towards lecture transcription in resource-scarce environments 

      De Villiers, Pieter; Jooste, Petri; Van Heerden, Carel J.; Barnard, Etienne (Pattern Recognition Association of South Africa (PRASA), 2012)
      We present progress towards automated Lecture Transcription (LT) in resource-scarce environments. Our development has focused on the transcription of lectures in Afrikaans from two faculties at North-West University. A ...
    • Validating smartphone-collected speech corpora 

      Van Heerden, Carel J.; Barnard, Etienne; Davel, Marelie H. (SLTU, 2012)
      We investigate the effectiveness with which the accuracy of a prompted speech corpus can be validated when minimal additional speech resources are available, and specifically when a language model in the target language ...