Web Mining Techniques to Block Spam Web Sites
Author(s) -
M. Esraa,
A. F.,
Erin J. Hanan
Publication year - 2018
Publication title -
International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/ijca2018917622
Subject(s) - computer science, block (permutation group theory), world wide web, web application, spamdexing, web mining, database, information retrieval, data mining, the internet, web development, web page, web search engine, geometry, mathematics
This paper introduces a system based on web mining techniques for blocking spam web pages. The system relies on content analysis, using the following features: the Uniform Resource Locator (URL), the number of words in the page title, Globally Popular Keywords (GPK), and n-grams. The proposed system uses Decision Tree (DT) rules, which the authors report as the best classifier for detecting web spam content. It achieves a reported accuracy of 0.97 (97%) in detecting spam web sites.
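As a rough illustration of the kind of content-based features the abstract lists, the following Python sketch extracts URL statistics, the title word count, keyword hits, and word bigrams from a page, then applies a few hand-written rules in the spirit of a decision tree. It is not the authors' implementation: the keyword set `POPULAR_KEYWORDS`, the function names, the individual feature definitions, and every threshold are assumptions chosen only for the example, and a real system would learn its rules from labeled data.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list; the paper's actual GPK list is not given in the abstract.
POPULAR_KEYWORDS = {"free", "cheap", "win", "casino", "loan"}


def word_ngrams(text, n=2):
    """Return the word n-grams of `text` as a list of tuples."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def extract_features(url, title, body):
    """Map one page to a small dictionary of content-based features."""
    parsed = urlparse(url)
    words = re.findall(r"[a-z0-9]+", body.lower())
    grams = word_ngrams(body, 2)
    return {
        "url_length": len(url),
        "url_digits": sum(ch.isdigit() for ch in url),
        "host_hyphens": parsed.netloc.count("-"),
        "title_words": len(title.split()),
        "gpk_hits": sum(w in POPULAR_KEYWORDS for w in words),
        # Share of bigrams that are repeats; high values suggest keyword stuffing.
        "repeated_bigram_ratio": (1 - len(set(grams)) / len(grams)) if grams else 0.0,
    }


def classify(features):
    """Illustrative decision rules standing in for a learned tree (thresholds are assumed)."""
    if features["gpk_hits"] >= 3:
        if features["repeated_bigram_ratio"] > 0.2:
            return "spam"
        if features["title_words"] > 12:
            return "spam"
        return "normal"
    return "spam" if features["host_hyphens"] >= 3 else "normal"
```

Calling `classify(extract_features(url, title, body))` returns a label for a single page. In practice the feature dictionaries would be fed to a decision tree learner, which derives the split thresholds from training data rather than having them fixed by hand.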