Word-Level vs Sentence-Level Language Identification: Application to Algerian and Arabic Dialects

Mohamed Lichouri; Mourad Abbas; Abed Alhakim Freihat; Dhiya El Hak Megtouf


Open Access


Author(s) - Mohamed Lichouri, Mourad Abbas, Abed Alhakim Freihat, Dhiya El Hak Megtouf

Publication year - 2018

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2018.10.484

Subject(s) - computer science, natural language processing, artificial intelligence, sentence, arabic, identification (biology), word (group theory), naive bayes classifier, support vector machine, set (abstract data type), speech recognition, linguistics, philosophy, botany, biology, programming language

In this paper, we investigate a set of methods for textual Arabic dialect identification, considering both word-level and sentence-level approaches. We used three classifiers: Linear Support Vector Machine (L-SVM), Bernoulli Naive Bayes (BNB), and Multinomial Naive Bayes (MNB), and then combined them using a voting procedure. We carried out experiments on two sets of dialects: the first, PADIC, consists of parallel sentences in Maghrebi and Middle Eastern dialects; the second is a set of Algerian dialects only, which we built manually. For the Arabic dialects, we obtained an average accuracy of 92%. For the Algerian dialects, our approach yielded an average accuracy of about 76%.
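
The voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how ties are broken or how word-level decisions are turned into a sentence label, so the choices below (ties go to the label seen first; the sentence label is the majority of per-word votes) are assumptions made for illustration only. The three classifiers are hypothetical lexicon lookups standing in for L-SVM, BNB, and MNB, and the tokens `x1`, `x2`, `y1` and labels `A`, `B` are placeholders, not real dialect data.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

Classifier = Callable[[str], str]  # maps a text (word or sentence) to a label


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first (assumption)."""
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    best = max(counts.values())
    # Scan in input order so that tie-breaking is deterministic.
    for label in labels:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def vote_sentence_level(sentence: str, classifiers: Sequence[Classifier]) -> str:
    """Each classifier labels the whole sentence; the labels are then voted on."""
    return majority_vote([clf(sentence) for clf in classifiers])


def vote_word_level(sentence: str, classifiers: Sequence[Classifier]) -> str:
    """Each word is labelled by a vote across classifiers; the sentence label
    is the majority of the per-word labels (an assumed aggregation rule)."""
    word_labels = [
        majority_vote([clf(word) for clf in classifiers])
        for word in sentence.split()
    ]
    return majority_vote(word_labels)


def make_lexicon_classifier(lexicon: Dict[str, str], default: str) -> Classifier:
    """Hypothetical stand-in for a trained model: looks tokens up in a lexicon."""
    def classify(text: str) -> str:
        votes = [lexicon[tok] for tok in text.split() if tok in lexicon]
        return majority_vote(votes) if votes else default
    return classify


# Three toy classifiers with placeholder tokens and labels.
toy_classifiers = [
    make_lexicon_classifier({"x1": "A", "x2": "A", "y1": "B"}, default="A"),
    make_lexicon_classifier({"x1": "A", "y1": "B"}, default="B"),
    make_lexicon_classifier({"x2": "A", "y1": "B"}, default="B"),
]

sentence_label = vote_sentence_level("x1 y1 x2", toy_classifiers)
word_label = vote_word_level("x1 y1 x2", toy_classifiers)
```

In this sketch the two levels differ only in where the vote is taken: once per sentence, or once per word followed by a second vote over the words. A real system would replace the lexicon lookups with trained models.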
