Open Access

The Impact of Feature Selection on Web Spam Detection

Author(s) - Jaber Karimpour, Ali Noroozi, Adeleh Abadi

Publication year - 2012

Publication title - International Journal of Intelligent Systems and Applications

Language(s) - English

Resource type - Journals

eISSN - 2074-9058

pISSN - 2074-904X

DOI - 10.5815/ijisa.2012.09.08

Subject(s) - spamming, spamdexing, computer science, feature selection, deep web, data mining, set (abstract data type), search engine optimization, selection (genetic algorithm), search engine, machine learning, feature (linguistics), rank (graph theory), genetic algorithm, artificial intelligence, information retrieval, the internet, web search engine, world wide web, web search query, linguistics, philosophy, mathematics, combinatorics, programming language

Search engines are among the most important tools for managing the massive amount of distributed web content. Web spamming attempts to deceive search engines into ranking some pages higher than they deserve. Many methods have been proposed to combat web spamming and to detect spam pages. A basic approach is classification, i.e., learning a model that classifies web pages as spam or non-spam. This work tries to select the best feature set for web spam classification using the imperialist competitive algorithm and the genetic algorithm (GA). The imperialist competitive algorithm is a novel optimization algorithm inspired by the socio-political process of imperialism in the real world. Experiments carried out on the WEBSPAM-UK2007 data set show that feature selection improves classification accuracy and that the imperialist competitive algorithm outperforms GA.
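To make the idea of wrapper-style feature selection concrete, the following is a minimal sketch of a genetic algorithm searching over binary feature masks. It is not the authors' implementation and does not cover the imperialist competitive algorithm. The data is synthetic (not WEBSPAM-UK2007), the nearest-centroid classifier is a stand-in for whatever classifier the paper uses, and every name and parameter (`make_data`, `centroid_accuracy`, `select_features`, population size, mutation rate) is an assumption chosen for illustration.

```python
import random

def make_data(n, n_informative, n_noise, rng):
    """Synthetic two-class data (assumption): informative features shift
    with the label, noise features are independent of it."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        informative = [rng.gauss(1.5 * y, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        rows.append(informative + noise)
        labels.append(y)
    return rows, labels

def centroid_accuracy(mask, train, valid):
    """Fitness of a mask: validation accuracy of a nearest-centroid
    classifier that only sees the features where mask[i] == 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    (x_train, y_train), (x_valid, y_valid) = train, valid
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in zip(x_train, y_train) if y == label]
        centroids[label] = [sum(x[i] for x in members) / len(members) for i in idx]
    correct = 0
    for x, y in zip(x_valid, y_valid):
        dist = {c: sum((x[i] - m) ** 2 for i, m in zip(idx, centroids[c]))
                for c in centroids}
        correct += (min(dist, key=dist.get) == y)
    return correct / len(y_valid)

def select_features(train, valid, n_features, rng,
                    pop_size=20, generations=15, p_mut=0.05):
    """Simple GA over bit masks: truncation selection, one-point
    crossover, bit-flip mutation, and elitism."""
    def fitness(mask):
        return centroid_accuracy(mask, train, valid)

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [1] * n_features  # seed with "use every feature" as a baseline
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]      # keep the better half as parents
        children = [scored[0][:]]              # elitism: carry over the top mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover position
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)
    return best, fitness(best)

rng = random.Random(0)
X, y = make_data(300, n_informative=3, n_noise=7, rng=rng)
train = (X[:200], y[:200])
valid = (X[200:], y[200:])

baseline = centroid_accuracy([1] * 10, train, valid)
best_mask, best_acc = select_features(train, valid, n_features=10, rng=rng)
print("all features:", baseline)
print("selected mask:", best_mask, "accuracy:", best_acc)
```

Because the all-features mask is seeded into the initial population and the best mask is carried forward each generation, the selected mask never scores below the all-features baseline on the validation split. In a real experiment the final accuracy would need to be reported on a separate held-out test set, since the mask is tuned against the validation data.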
