Open Access
IMPLEMENTATION OF HYPERPARAMETER OPTIMISATION AND OVER-SAMPLING IN DETECTING CYBERBULLYING USING MACHINE LEARNING APPROACH
Author(s) -
Wan Noor Hamiza Wan Ali,
Masnizah Mohd,
Fariza Fauzi,
Kohji Shirai,
Muhammad Junaidi Mahamad Noor
Publication year - 2021
Publication title -
Malaysian Journal of Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.197
H-Index - 18
ISSN - 0127-9084
DOI - 10.22452/mjcs.sp2021no2.6
Subject(s) - computer science, word2vec, hyperparameter, artificial intelligence, social media, word embedding, word (group theory), machine learning, natural language processing, set (abstract data type), hyperparameter optimization, support vector machine, information retrieval, world wide web, embedding, linguistics, philosophy, programming language
Online social networks have become a necessity for people around the world. They let us connect with one another at any time, with social media serving as a platform for broadcasting information and social networking as a platform for communicating. However, this evolution has also made it possible for people to commit various cybercrimes, such as cyberbullying. Machine learning can be used to counter cyberbullying in online social networks. This study therefore proposes a framework with a feature set consisting of word-level and character-level term frequency–inverse document frequency (TF-IDF), word embeddings produced with Word2vec, and six types of term lists: profane words, proper nouns, negation words, ‘allness’ terms, diminisher words and intensifier words. These features were divided into four groups and fed into a linear support vector classifier, which was trained on an ASKfm data set with hyperparameter tuning and over-sampling. The results indicate that the proposed framework produced strong outcomes: the highest area under the curve (AUC) is 99.24% and F-measure is 97.38%, as achieved by our trained model.
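As a rough illustration of two ingredients named in the abstract, the sketch below computes word-level and character-level TF-IDF weights and then balances the classes by random over-sampling. It is a minimal standard-library sketch, not the authors' implementation. The function names, the toy posts, the trigram size, the smoothed idf variant and the choice of random duplication as the over-sampling method are all assumptions made here for illustration. The Word2vec embeddings, the six term lists, the linear support vector classifier and the hyperparameter search are left out.

```python
import math
import random
from collections import Counter


def word_tokens(text):
    """Lower-cased whitespace tokens (word-level terms)."""
    return text.lower().split()


def char_ngrams(text, n=3):
    """Overlapping character n-grams (character-level terms); n=3 is an assumption."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def tfidf(docs, tokenizer):
    """Return one {term: weight} dict per document.

    Uses raw term frequency and the smoothed idf
    log((1 + N) / (1 + df)) + 1, one common variant among several.
    """
    tokenized = [tokenizer(d) for d in docs]
    n_docs = len(docs)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))  # count each term once per document
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


# Toy posts invented for illustration; not taken from the ASKfm data set.
posts = ["you are so stupid", "have a nice day", "see you at lunch", "nice work today"]
labels = [1, 0, 0, 0]  # 1 = bullying, 0 = not bullying

word_feats = tfidf(posts, word_tokens)
char_feats = tfidf(posts, char_ngrams)

# Merge both views, prefixing keys so word and character terms cannot collide.
features = [
    {**{"w:" + t: v for t, v in w.items()}, **{"c:" + t: v for t, v in c.items()}}
    for w, c in zip(word_feats, char_feats)
]

balanced_x, balanced_y = random_oversample(features, labels)
```

In a real experiment, over-sampling would normally be applied to the training portion only, so that duplicated posts do not leak into the data used for evaluation or hyperparameter selection.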
