Open Access
Effective spam filter based on a hybrid method of header checking and content parsing
Author(s) - Chu KoTsung, Hsu HuaTing, Sheu JyhJian, Yang WeiPang, Lee ChengChi
Publication year - 2020
Publication title - IET Networks
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.466
H-Index - 21
eISSN - 2047-4962
pISSN - 2047-4954
DOI - 10.1049/iet-net.2019.0191
Subject(s) - header , computer science , naive bayes classifier , artificial intelligence , machine learning , decision tree , phishing , parsing , support vector machine , bloom filter , filter (signal processing) , data mining , boosting (machine learning) , association rule learning , world wide web , the internet , algorithm , computer network , computer vision
In recent years, hazardous e-mails have proliferated, including messages infected with viruses or worms that spread destructive programs and phishing mails that defraud users of their e-mail accounts. The volume of spam continues to grow, and as the related problems become more severe, spam has become a significant topic in various research domains. Common filtering methods include black/white lists, rule learning, and approaches based on text classification, such as Naïve Bayes, support vector machines, and boosting trees, as well as multi-agent and genetic algorithms. Among these, the methods based on text classification are the most widely applied. In addition, some efficient methods consider only the e-mail's header section, which can improve both operating efficiency and classification efficiency. By applying machine learning techniques and the decision tree data mining algorithm C4.5, this study proposes an efficient spam filtering method with the following features: (i) a two-phase filtering mechanism that scans mainly the e-mail's header, with the content as an auxiliary check; and (ii) a reduction of the false positive problem. The experimental results show that the authors' method achieves an accuracy rate of 98.76%. Compared with other methods that use the same spam data sets, and with deep learning-based methods, their method performs favourably.
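The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the header rules, the keyword list `SPAM_WORDS`, the threshold, and the function names `header_verdict`, `content_verdict`, and `classify` are all hypothetical placeholders standing in for rules that a decision tree such as C4.5 might learn from training data. The sketch only shows the control flow: decide from the header where possible, consult the content only when the header is inconclusive, and default to "ham" to limit false positives.

```python
import re
from email import message_from_string
from email.message import Message

# Hypothetical keyword list; a real filter would learn its features from data.
SPAM_WORDS = {"lottery", "winner", "prize"}


def header_verdict(msg: Message) -> str:
    """Phase 1: return 'spam', 'ham', or 'unsure' using header fields only.

    These hand-written rules are illustrative assumptions, not rules
    taken from the paper.
    """
    sender = msg.get("From", "")
    if "@" not in sender:
        return "spam"    # missing or malformed sender address
    if msg.get("In-Reply-To"):
        return "ham"     # assumed: replies in an existing thread are legitimate
    return "unsure"      # header alone is inconclusive


def content_verdict(body: str, threshold: int = 2) -> str:
    """Phase 2: auxiliary content check, used only when phase 1 is unsure."""
    tokens = set(re.findall(r"[a-z]+", body.lower()))
    hits = len(tokens & SPAM_WORDS)
    # Require several distinct hits before flagging, to limit false positives.
    return "spam" if hits >= threshold else "ham"


def classify(raw: str) -> str:
    """Run the header phase first; fall back to the content phase if needed."""
    msg = message_from_string(raw)
    verdict = header_verdict(msg)
    if verdict != "unsure":
        return verdict
    body = msg.get_payload()
    if not isinstance(body, str):
        return "ham"     # multipart bodies are not parsed in this sketch
    return content_verdict(body)


if __name__ == "__main__":
    sample = (
        "From: bob@example.com\n"
        "Subject: You are a WINNER\n"
        "\n"
        "Claim your lottery prize now, winner!\n"
    )
    print(classify(sample))
```

Checking the header first keeps the common case cheap, since the message body is tokenized only for mail the header rules cannot decide. Which header fields and thresholds are actually informative would have to come from a trained model rather than from this sketch.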
