Gene subset selection in microarray data using entropic filtering for cancer classification
Author(s) -
Félix F. González Navarro,
Lluís A. Belanche Muñoz
Publication year - 2009
Publication title -
Expert Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.365
H-Index - 38
eISSN - 1468-0394
pISSN - 0266-4720
DOI - 10.1111/j.1468-0394.2008.00489.x
Subject(s) - computer science , feature selection , entropy (arrow of time) , gene selection , microarray analysis techniques , artificial intelligence , multivariate statistics , data mining , feature (linguistics) , set (abstract data type) , pattern recognition (psychology) , selection (genetic algorithm) , data set , machine learning , gene , biology , biochemistry , physics , gene expression , linguistics , philosophy , quantum mechanics , programming language
This work describes an entropic filtering algorithm (EFA) for feature selection as a workable method for generating a relevant subset of genes. It is a fast feature selection method that searches for feature subsets that jointly maximize the normalized multivariate conditional entropy with respect to the classification ability of tumours. The EFA is tested in combination with several machine learning algorithms on five public domain microarray data sets. This combination is found to yield gene subsets whose accuracies are similar to or much better than those obtained with the full set of genes. The solutions are of comparable quality to previous results, but they are obtained in at most half an hour of computing time and use very few genes.
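The general idea of an entropy-based filter can be illustrated with a minimal sketch. It is not the EFA itself, whose exact criterion and search strategy are not given in this abstract. The sketch assumes discretized expression values, a hypothetical relevance score 1 - H(Y|X_S)/H(Y), and a plain greedy forward search; the function names and the toy data are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(rows, labels, subset):
    """H(Y | X_S): label entropy within groups of samples that share the
    same values on the features in `subset`, weighted by group size."""
    groups = {}
    for row, y in zip(rows, labels):
        key = tuple(row[j] for j in subset)
        groups.setdefault(key, []).append(y)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def relevance(rows, labels, subset):
    """Assumed normalized score in [0, 1]: 1 - H(Y|X_S) / H(Y).
    This is an illustrative stand-in, not the paper's definition."""
    h_y = entropy(labels)
    if h_y == 0:
        return 1.0
    return 1.0 - conditional_entropy(rows, labels, subset) / h_y


def greedy_select(rows, labels, max_features):
    """Greedy forward search: add the feature that most improves the
    score, and stop when no feature gives a strict improvement."""
    selected, best = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        cand_score, cand = best, None
        for j in remaining:
            s = relevance(rows, labels, selected + [j])
            if s > cand_score + 1e-12:
                cand_score, cand = s, j
        if cand is None:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected, best


# Toy data (made up): 6 samples, 3 genes, expression discretized to 0/1.
rows = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
]
labels = ["normal", "normal", "tumour", "tumour", "normal", "tumour"]

selected, score = greedy_select(rows, labels, max_features=2)
print(selected, score)
```

In this toy example the class is fully determined by the first gene, so the search stops after selecting it. With real microarray data, where samples are few and genes are many, the joint conditional entropy reaches zero after only a handful of features, so any such score has to be estimated and validated with care.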