Open Access
Weighted k-Nearest Neighbour for Image Spam Classification
Author(s) - Ahmad Mahdi Salih, Ban N. Dhannoon
Publication year - 2021
Publication title - Iraqi Journal of Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.152
H-Index - 4
eISSN - 2312-1637
pISSN - 0067-2904
DOI - 10.24996/ijs.2021.62.3.32
Subject(s) - spamming, computer science, artificial intelligence, nearest neighbour, benchmark (surveying), classifier (uml), pattern recognition (psychology), k nearest neighbors algorithm, data mining, image (mathematics), schema (genetic algorithms), machine learning, the internet, world wide web, geodesy, geography
E-mail is an efficient and reliable data exchange service. Spam consists of unwanted e-mail messages sent indiscriminately in bulk, usually for commercial purposes. Obfuscated image spam is one of the newer tricks for bypassing text-based and Optical Character Recognition (OCR)-based spam filters. Image spam detection based on visual features has the advantage of efficiency, reducing computational cost and improving performance. This paper presents an image spam detection scheme. Suitable image processing techniques are used to capture the image features that differentiate spam images from non-spam ones. Weighted k-nearest neighbour, a simple yet powerful machine learning algorithm, is used as the classifier. The scheme was evaluated on two datasets, and the results confirm its effectiveness. The first is a real benchmark dataset; the second is a realistic, modern, and more challenging dataset collected from social media and several publicly available image spam datasets. The obtained accuracy was 99.36% on the benchmark dataset and 91% on the proposed dataset.
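As a rough illustration of the classifier named in the abstract, the following is a minimal sketch of weighted k-nearest neighbour voting. The abstract does not state the weighting function, the distance metric, or the image features used, so everything here is an assumption: inverse-distance weights, Euclidean distance, and the function name `weighted_knn_predict` with its two-dimensional toy feature vectors are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Label `query` by inverse-distance-weighted voting among its k nearest neighbours.

    Assumptions (not from the paper): Euclidean distance and 1/d weights.
    """
    # Distance from the query to every training vector, paired with its label.
    neighbours = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    neighbours.sort(key=lambda pair: pair[0])

    # Each of the k closest neighbours votes with weight 1 / (distance + eps);
    # eps avoids division by zero when the query coincides with a training point.
    votes = defaultdict(float)
    for dist, label in neighbours[:k]:
        votes[label] += 1.0 / (dist + eps)

    # The label with the largest total weight wins.
    return max(votes, key=votes.get)


# Hypothetical feature vectors standing in for extracted image features.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
train_y = ["ham", "spam", "spam"]
prediction = weighted_knn_predict(train_X, train_y, (0.1, 0.1), k=3)
```

In this toy example, two of the three neighbours are labelled "spam", so an unweighted majority vote would return "spam". The single "ham" neighbour is much closer to the query, so the weighted vote returns "ham". This is the behaviour that distinguishes the weighted variant from plain k-nearest neighbour.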
