Image Spam Detection Using Machine Learning and Natural Language Processing
Author(s) -
Yaseen Khather Yaseen,
Alaa Khudhair Abbas,
Ahmed M. Sana
Publication year - 2020
Publication title -
xi'nan jiaotong daxue xuebao
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.308
H-Index - 21
ISSN - 0258-2724
DOI - 10.35741/issn.0258-2724.55.2.41
Subject(s) - computer science, artificial intelligence, slang, natural language processing, character (mathematics), optical character recognition, machine learning, natural language, bag of words model, image (mathematics), pattern recognition (psychology), speech recognition, philosophy, linguistics, geometry, mathematics
Today, images are a routine part of communication between people. However, images are also used to share information by hiding or embedding messages within them, and images received through social media or email can contain harmful content that users cannot see and are therefore unaware of. This paper presents a model for detecting spam in images. The model combines optical character recognition, natural language processing, and a machine learning algorithm. Optical character recognition extracts the text from the images, and natural language processing uses linguistic capabilities to detect and classify the language and to distinguish normal text from slang. Features for the selected images are then extracted using the bag-of-words model, and the machine learning algorithm is run to detect any spam the images may contain. Finally, the model predicts whether or not an image contains harmful content. The results show that the proposed method, which combines the machine learning algorithm with optical character recognition and natural language processing, achieves higher detection accuracy than machine learning alone.
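The later stages of the pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier, so a multinomial naive Bayes model is assumed here, and the slang lexicon, the `<slang>` marker token, and the toy training texts are all hypothetical. The OCR stage is left out; the input strings stand in for text that an OCR engine would have extracted from images.

```python
import math
import re
from collections import Counter

# Hypothetical slang lexicon; the abstract does not specify one.
SLANG = {"u", "ur", "plz", "luv", "gr8"}


def tokenize(text):
    """Lowercase the (assumed) OCR output and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(text):
    """Count tokens; add one '<slang>' marker per token found in SLANG."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    slang_hits = sum(1 for t in tokens if t in SLANG)
    if slang_hits:
        counts["<slang>"] = slang_hits
    return counts


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (an assumed classifier)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        bow = bag_of_words(text)
        total_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n_docs in self.class_counts.items():
            total = sum(self.word_counts[c].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word, n in bow.items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                p = (self.word_counts[c][word] + 1) / (total + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best


# Toy stand-ins for text extracted from images (hypothetical data).
train = [
    ("win free cash now click here", "spam"),
    ("claim ur free prize plz click", "spam"),
    ("meeting moved to monday morning", "ham"),
    ("happy birthday see you at dinner", "ham"),
]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])

print(model.predict("free prize click now"))
print(model.predict("see you monday"))
```

In a fuller version, each input string would come from running an OCR engine on the image, and the slang check would be replaced by whatever language classification the natural language processing stage performs.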
