
A Semi-Supervised Learning Approach for Tackling Twitter Spam Drift
Author(s) -
Priyanka Pramila R.,
J Bhuvana
Publication year - 2022
Publication title -
International Journal for Research in Applied Science and Engineering Technology
Language(s) - English
Resource type - Journals
ISSN - 2321-9653
DOI - 10.22214/ijraset.2022.40397
Subject(s) - computer science, machine learning, artificial intelligence, supervised learning, domain (mathematical analysis), concept drift, labeled data, semi supervised learning, data mining, data stream mining, artificial neural network, mathematical analysis, mathematics
Twitter plays an important role in accelerating the spread of spam. To protect users, Twitter and the research community have developed a range of spam detection systems based on different machine-learning techniques. However, a recent study showed that current machine learning-based detection systems cannot detect spam accurately because the characteristics of spam tweets vary over time. This issue is called "Twitter Spam Drift". To tackle it, a semi-supervised learning approach (SSLA) is proposed. The approach uses unlabeled data to learn the structure of the domain. To handle the drift, a live Twitter data stream is used for the study. The downloaded live data is pre-processed and labeled, and machine learning is applied to distinguish spam users from non-spam users. The data is kept in cloud storage, which users can access from anywhere. Experiments were conducted with more than one machine learning algorithm to find the one that performs best on the proposed problem in terms of accuracy.
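
The abstract does not spell out how the SSLA uses unlabeled data, so the following is only a minimal sketch of one common semi-supervised technique, self-training, and not the authors' method. It assumes a nearest-centroid classifier, two hypothetical per-tweet features (URL count and hashtag count), and an illustrative confidence threshold `min_margin`; all function names and data values are made up for the example. Confidently pseudo-labeled tweets are folded back into the training set, which lets the class centroids follow drifting spam characteristics.

```python
from statistics import fmean

def centroid(rows):
    """Mean feature vector of a non-empty list of rows."""
    return [fmean(col) for col in zip(*rows)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def fit(X, y):
    """Nearest-centroid model: one centroid per class label."""
    return {lab: centroid([x for x, l in zip(X, y) if l == lab])
            for lab in set(y)}

def predict_with_margin(model, x):
    """Return (label, margin), where margin is the gap between the
    distances to the two nearest centroids (larger = more confident)."""
    ranked = sorted((distance(x, c), lab) for lab, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin

def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=5):
    """Self-training: repeatedly pseudo-label confident unlabeled samples,
    add them to the training set, and refit the model."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit(X, y)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(model, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:
            break  # nothing left that the model is sure about
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = rest
        model = fit(X, y)
    return model

# Hypothetical features: [URLs per tweet, hashtags per tweet].
X_lab = [[3.0, 4.0], [4.0, 5.0], [0.0, 1.0], [1.0, 0.0]]
y_lab = ["spam", "spam", "ham", "ham"]
# Unlabeled stream in which spam has drifted toward higher counts.
X_unlab = [[5.0, 6.0], [6.0, 7.0], [0.5, 0.5]]

model = self_train(X_lab, y_lab, X_unlab)
print(predict_with_margin(model, [6.0, 6.5])[0])
```

In this toy run the spam centroid moves from the labeled-only position toward the drifted samples once they are pseudo-labeled. A real system would need a stronger classifier and a safeguard against reinforcing its own mislabels.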