User Classification on Online Social Networks by Post Frequency
Author(s) -
Gabriel Marques Tavares,
Saulo Martiello Mastelini,
Sylvio Jr.
Publication year - 2017
Publication title -
Anais do Simpósio Brasileiro de Sistemas de Informação (SBSI)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/sbsi.2017.6076
Subject(s) - boosting (machine learning), gradient boosting, computer science, random forest, support vector machine, artificial intelligence, machine learning, task (project management), statistical classification, social media, data mining, world wide web, engineering, systems engineering
This paper proposes a technique for classifying user accounts to detect fraud in Online Social Networks (OSN). The main purpose of the classification is to recognize the posting patterns of three types of users: humans, bots, and cyborgs. Classic, well-established Text Mining approaches rely on textual features from Natural Language Processing (NLP) for classification, but drawbacks such as computational cost and the huge amount of data to process can arise in real-life scenarios. This work instead uses statistical frequency parameters of user posting to distinguish the types of users without relying on textual content. We run the experiment on a Twitter dataset and compare five learning algorithms on the classification task: Random Forest (RF), Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost). Using the standard parameters of each algorithm, we achieved accuracies of 88% with RF and 84% with XGBoost.
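As a rough illustration of what "statistical frequency parameters of user posting" might look like, the sketch below summarizes the gaps between one account's posts. The abstract does not list the parameters the authors actually used, so the function name `posting_frequency_features`, the chosen statistics, and the two sample timelines are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def posting_frequency_features(timestamps):
    """Summarize posting rhythm from post times given in seconds.

    Hypothetical feature set: the abstract does not say which
    statistics the authors computed.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two posts to measure intervals")
    # Time elapsed between each pair of consecutive posts.
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    span = ts[-1] - ts[0]
    return {
        "n_posts": len(ts),
        "mean_gap": mean(gaps),
        "std_gap": pstdev(gaps),
        "min_gap": min(gaps),
        "max_gap": max(gaps),
        "posts_per_hour": len(ts) / (span / 3600) if span > 0 else float("inf"),
    }


# Invented timelines: one perfectly periodic, one irregular.
bot_like = [0, 600, 1200, 1800, 2400]
human_like = [0, 45, 4000, 4100, 20000]

bot_features = posting_frequency_features(bot_like)
human_features = posting_frequency_features(human_like)
```

No text content is read at any point, which is the property the abstract emphasizes. Feature dictionaries like these could then be turned into rows of a matrix and passed to off-the-shelf classifiers with their default settings, for example scikit-learn's `RandomForestClassifier`; that step is left out here because the dataset and labels are not part of this record.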