Open Access

Exploring multinomial naïve Bayes for Yorùbá text document classification

Author(s) - Ikechukwu Ignatius Ayogu

Publication year - 2020

Publication title - Nigerian Journal of Technology

Language(s) - English

Resource type - Journals

eISSN - 2467-8821

pISSN - 0331-8443

DOI - 10.4314/njt.v39i2.23

Subject(s) - bigram, trigram, natural language processing, computer science, artificial intelligence, naive bayes classifier, yoruba, representation (politics), text categorization, categorization, linguistics, support vector machine, politics, philosophy, political science, law

The recent growth of Nigerian-language text online motivates this paper, which considers the problem of classifying text documents written in the Yorùbá language into one of a few pre-designated classes. Text document classification (categorization) research is well established for English and many other languages, but not for Nigerian languages. The paper evaluates the performance of a multinomial naïve Bayes (NB) model learned on a research dataset consisting of 100 text samples from each of the business, sports, entertainment, technology and politics domains. The model is learned separately on unigram, bigram and trigram features obtained using the bag-of-words (BoW) representation. Results show that the model's performance on unigram and bigram features is comparable, and in both cases significantly better than that of a model learned on trigram features. The results generally suggest that practical application of the NB algorithm to the classification of text documents written in the Yorùbá language is feasible.

Keywords: supervised learning, text classification, Yorùbá language, text mining, BoW representation
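To make the setup described in the abstract concrete, the following is a minimal sketch of a multinomial naïve Bayes classifier over bag-of-words n-gram counts, written in plain Python. It is not the author's implementation. The whitespace tokenizer, the add-one smoothing constant `alpha`, the choice to ignore n-grams unseen in training, and the tiny English toy corpus are all assumptions made for illustration; the paper's dataset and preprocessing are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n):
    """Return the word n-grams of a lower-cased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class MultinomialNB:
    """Multinomial naive Bayes over n-gram counts with add-alpha smoothing."""

    def __init__(self, n=1, alpha=1.0):
        self.n = n          # n-gram order: 1 = unigram, 2 = bigram, 3 = trigram
        self.alpha = alpha  # smoothing constant (assumed, not from the paper)

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)   # label -> number of documents
        self.counts = defaultdict(Counter)  # label -> n-gram -> count
        for doc, label in zip(docs, labels):
            self.counts[label].update(ngrams(doc, self.n))
        self.vocab = {g for c in self.counts.values() for g in c}
        self.totals = {y: sum(c.values()) for y, c in self.counts.items()}
        self.n_docs = len(labels)
        return self

    def log_scores(self, doc):
        """Return log P(y) + sum of log P(g | y) for each class y."""
        scores = {}
        vocab_size = len(self.vocab)
        for y in self.class_docs:
            score = math.log(self.class_docs[y] / self.n_docs)
            denom = self.totals[y] + self.alpha * vocab_size
            for g in ngrams(doc, self.n):
                if g in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((self.counts[y][g] + self.alpha) / denom)
            scores[y] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical toy corpus (English placeholders, two classes only).
train_docs = [
    "the bank raised interest rates",
    "shares fell on the stock market",
    "the team won the football match",
    "the striker scored two goals",
]
train_labels = ["business", "business", "sports", "sports"]

# One model per feature set, mirroring the unigram/bigram/trigram comparison.
models = {n: MultinomialNB(n=n).fit(train_docs, train_labels) for n in (1, 2, 3)}
query = "the team scored goals"
unigram_prediction = models[1].predict(query)
bigram_prediction = models[2].predict(query)
trigram_scores = models[3].log_scores(query)
```

On this toy query the unigram and bigram models both find overlapping features, while none of the query's trigrams occur in the training data, so the trigram model's scores reduce to the class priors. This only illustrates how sparse higher-order n-grams become on small corpora; it is not a claim about the cause of the results reported in the paper.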
