dc.contributor.author: Giwa, Oluwapelumi
dc.contributor.author: Davel, Marelie H.
dc.date.accessioned: 2018-03-02T13:03:23Z
dc.date.available: 2018-03-02T13:03:23Z
dc.date.issued: 2015
dc.identifier.uri: http://ieeexplore.ieee.org/document/7359517/
dc.identifier.uri: https://www.semanticscholar.org/paper/Text-based-language-identification-of-multilingual-Giwa-Davel/f4ed2100a8423a2621d9a4e22f18899eedd33c98
dc.identifier.uri: http://hdl.handle.net/10394/26489
dc.description.abstract: Text-based language identification (T-LID) of isolated words has been shown to be useful for various speech processing tasks, including pronunciation modelling and data categorisation. When the words to be categorised are proper names, the task becomes more difficult: not only do proper names often have idiosyncratic spellings, they are also often considered to be multilingual. We therefore investigate how an existing T-LID technique can be adapted to perform multilingual word classification. That is, given a proper name, which may be either mono- or multilingual, we aim to determine how accurately we can predict how many possible source languages the word has, and what they are. Using a Joint Sequence Model-based approach to T-LID and the SADE corpus – a newly developed proper names corpus of South African names – we experiment with different approaches to multilingual T-LID. We compare posterior-based and likelihood-based methods and obtain promising results on a challenging task. [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.subject: Text-based language identification [en_US]
dc.subject: Language Identification [en_US]
dc.subject: Multilingual Names [en_US]
dc.subject: Pronunciation modelling [en_US]
dc.title: Text-based Language Identification of Multilingual Names [en_US]
dc.type: Presentation [en_US]
dc.contributor.researchID: 23607955 - Davel, Marelie Hattingh

