
Identification of Genes Discriminating Multiple Sclerosis Patients from Controls by Adapting a Pathway Analysis Method
Author(s) -
Lei Zhang,
Linlin Wang,
Pu Tian,
Suyan Tian
Publication year - 2016
Publication title -
PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0165543
Subject(s) - feature selection, microarray analysis techniques, dimensionality reduction, microarray databases, computer science, identification (biology), gene chip analysis, set (abstract data type), feature (linguistics), computational biology, microarray, selection (genetic algorithm), pattern recognition (psychology), data mining, data set, artificial intelligence, gene, biology, gene expression, genetics, linguistics, botany, philosophy, programming language
The focus of microarray data analysis has shifted from identifying individual associated genes to identifying associated biological pathways or gene sets. In bioinformatics, a feature selection algorithm is usually used to cope with the high dimensionality of microarray data. In addition to algorithms that use the biological information contained within a gene set as a priori knowledge to facilitate feature selection, various gene set analysis methods can be applied directly, or readily modified, for the purpose of feature selection. The significance analysis of microarray to gene-set reduction analysis (SAM-GSR) algorithm, a novel direction in gene set analysis, is one such method. Here, we explore the feature selection property of SAM-GSR and propose a modification that better achieves the goal of feature selection. In an application to multiple sclerosis (MS) microarray data, both SAM-GSR and our modified SAM-GSR perform well. Our results show that SAM-GSR can indeed carry out feature selection, and that the modified SAM-GSR outperforms the original. Given that pathway information is far from complete, a statistical method capable of constructing biologically meaningful gene networks is of interest. Consequently, both SAM-GSR algorithms will be continually re-evaluated in our future work and thus better characterized.