Towards Accurate Detection of Offensive Language in Online Communication in Arabic

Open Access

Author(s) - Azalden Alakrot, Liam Murray, Nikola S. Nikolov

Publication year - 2018

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2018.10.491

Subject(s) - computer science, offensive, arabic, classifier (uml), natural language processing, artificial intelligence, support vector machine, variety (cybernetics), speech recognition, linguistics, philosophy, management, economics

We present the results of predictive modelling for the detection of anti-social behaviour in online communication in Arabic, such as comments which contain obscene or offensive words and phrases. We collected and labelled a large dataset of YouTube comments in Arabic which contains a broad range of both offensive and inoffensive comments. We used this dataset to train a Support Vector Machine classifier and experimented with combinations of word-level features, N-gram features and a variety of pre-processing techniques. We summarise the pre-processing steps and features that allow training a classifier which is more precise, with 90.05% accuracy, than classifiers reported by previous studies on Arabic text.
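The abstract does not list the specific pre-processing steps or feature settings, so the following is only a minimal sketch of what word-level and N-gram feature extraction for Arabic comments might look like. The normalisation rules (stripping diacritics and tatweel, unifying alef variants, collapsing elongated letters) are common choices for Arabic text and are assumptions here, not the authors' reported pipeline; the function names `normalise`, `tokenise` and `ngram_features` are hypothetical.

```python
import re
from collections import Counter

# Assumed normalisation rules; common for Arabic text, not taken from the paper.
DIACRITICS = re.compile(r"[\u064B-\u0652]")          # fathatan through sukun
TATWEEL = "\u0640"                                   # elongation character
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda / hamza
REPEATS = re.compile(r"(.)\1{2,}")                   # same character 3+ times


def normalise(text: str) -> str:
    """Apply the assumed pre-processing steps to one comment."""
    text = DIACRITICS.sub("", text)           # drop short-vowel marks
    text = text.replace(TATWEEL, "")          # drop elongation characters
    text = ALEF_VARIANTS.sub("\u0627", text)  # map alef variants to bare alef
    text = REPEATS.sub(r"\1", text)           # collapse elongated letters
    return text


def tokenise(text: str) -> list[str]:
    """Split a normalised comment into word tokens."""
    return re.findall(r"\w+", normalise(text))


def ngram_features(text: str, max_n: int = 2) -> Counter:
    """Count word N-grams for N = 1..max_n (N = 1 gives word-level features)."""
    tokens = tokenise(text)
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats
```

Feature counts of this kind could then be vectorised and passed to a Support Vector Machine implementation; which combination of steps and N-gram sizes works best is the question the paper itself investigates.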
