
Fast feature selection using a simple estimation of distribution algorithm: a case study on splice site prediction
Author(s) - Yvan Saeys, Sven Degroeve, Dirk Aeyels, Yves Van de Peer, Pierre Rouzé
Publication year - 2003
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btg1076
Subject(s) - discriminative model, feature selection, feature (linguistics), computer science, preprocessor, heuristic, pattern recognition (psychology), greedy algorithm, artificial intelligence, set (abstract data type), selection (genetic algorithm), focus (optics), simple (philosophy), machine learning, data mining, algorithm, philosophy, linguistics, physics, optics, programming language, epistemology
Feature subset selection is an important preprocessing step for classification. In biology, where structures or processes are described by a large number of features, eliminating irrelevant and redundant information in a reasonable amount of time has several advantages. It enables the classification system to achieve good or even better solutions with a restricted subset of features, allows for faster classification, and helps the human expert focus on a relevant subset of features, thereby providing useful biological knowledge.