The Development of a Sepedi Text Generation Model Using Transformers
Abstract
Text generation is an important sub-task
of natural language generation (NLG) and aims to produce
human-readable text given some input text. Deep learning
approaches based on neural networks have been proposed to
solve text generation tasks. Although these models can generate
text, they do not necessarily capture long-term dependencies
accurately, making it difficult to coherently generate longer
sentences. Transformer-based models have shown significant
improvement in text generation. However, these models are
computationally expensive and data-hungry. In this study, we
develop a Sepedi text generation model using a Transformer-based approach and explore its performance. The developed
model has one Transformer block with causal masking on the
attention layers and two separate embedding layers. To train
the model, we use the National Centre for Human Language
Technology (NCHLT) Sepedi text corpus. Our experimental
setup varied the model embedding size, batch size and the
sequence length. The final model was able to reconstruct unseen
test data with 75% accuracy, the highest accuracy achieved to
date using a Sepedi corpus.
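
For illustration only, the architecture described above (a single Transformer block with causal masking on the attention layer and two separate embedding layers, one for tokens and one for positions) could be sketched as follows. This is not the authors' code; it assumes a Keras implementation, and all hyperparameter values (vocab_size, maxlen, embed_dim, num_heads, ff_dim) are placeholders rather than the values used in the paper.

```python
# Illustrative sketch only: single-block Transformer language model with
# causal self-attention and separate token/position embeddings.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 20000   # placeholder vocabulary size
maxlen = 80          # placeholder sequence length
embed_dim = 256      # placeholder embedding size (varied in the experiments)
num_heads = 2        # placeholder number of attention heads
ff_dim = 256         # placeholder feed-forward width


class TokenAndPositionEmbedding(layers.Layer):
    """Two separate embedding layers: one for tokens and one for positions."""

    def __init__(self, maxlen, vocab_size, embed_dim):
        super().__init__()
        self.token_emb = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.pos_emb = layers.Embedding(input_dim=maxlen, output_dim=embed_dim)

    def call(self, x):
        positions = tf.range(start=0, limit=tf.shape(x)[-1], delta=1)
        return self.token_emb(x) + self.pos_emb(positions)


inputs = layers.Input(shape=(maxlen,), dtype="int32")
x = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)(inputs)

# One Transformer block. Causal masking (TF >= 2.10) restricts each position
# to attend only to earlier positions, as required for left-to-right generation.
attn_out = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(
    x, x, use_causal_mask=True)
x = layers.LayerNormalization(epsilon=1e-6)(x + attn_out)
ffn_out = layers.Dense(embed_dim)(layers.Dense(ff_dim, activation="relu")(x))
x = layers.LayerNormalization(epsilon=1e-6)(x + ffn_out)

# Predict a distribution over the vocabulary for the next token at each position.
outputs = layers.Dense(vocab_size, activation="softmax")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Under this kind of setup, accuracy on unseen test data would correspond to per-token next-word prediction accuracy, although the paper's exact evaluation protocol may differ.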