Ordinal response prediction using bootstrap aggregation, with application to a high‐throughput methylation data set
Author(s) - Archer K. J., Mas V. R.
Publication year - 2009
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.3707
Subject(s) - computer science, ordinal regression, classifier (UML), ordinal data, data set, data mining, artificial intelligence, machine learning
Many investigators conducting translational research perform high‐throughput genomic experiments and then develop multigenic classifiers from the resulting high‐dimensional data sets. In a large number of applications, the class to be predicted is inherently ordinal. Examples of ordinal outcomes include tumor‐node‐metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, doing so discards ordering information that could improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and that both perform well in comparison with previously described methods. Developing a multigenic classifier is a common goal of microarray studies, and application of the ordinal ensemble methods is therefore demonstrated on a high‐throughput methylation data set. Copyright © 2009 John Wiley & Sons, Ltd.
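The abstract names two ordinal splitting criteria without stating their formulas. The sketch below illustrates one common formulation of each and is not taken from the paper itself: it assumes the ordinal impurity of a node is the sum over classes of F(j)(1 - F(j)), where F is the cumulative class proportion, and that ordered twoing scores a split by the largest value of p_L * p_R * (F_L(j) - F_R(j))^2 over the cut points j between adjacent classes. The function names and the integer coding of classes as 1..n_classes are hypothetical, and the bootstrap aggregation step is omitted.

```python
from collections import Counter


def class_proportions(labels, n_classes):
    """Proportion of observations in each ordered class 1..n_classes."""
    counts = Counter(labels)
    n = len(labels)
    return [counts.get(j, 0) / n for j in range(1, n_classes + 1)]


def ordinal_impurity(labels, n_classes):
    """Assumed form: sum over classes of F(j) * (1 - F(j)),
    where F(j) is the cumulative proportion up to class j."""
    if not labels:
        return 0.0
    cumulative = 0.0
    total = 0.0
    for p in class_proportions(labels, n_classes):
        cumulative += p
        total += cumulative * (1.0 - cumulative)
    return total


def impurity_decrease(left, right, n_classes):
    """Decrease in ordinal impurity when a node is split into left and right."""
    parent = list(left) + list(right)
    n = len(parent)
    return (
        ordinal_impurity(parent, n_classes)
        - len(left) / n * ordinal_impurity(left, n_classes)
        - len(right) / n * ordinal_impurity(right, n_classes)
    )


def ordered_twoing(left, right, n_classes):
    """Assumed form: best two-superclass separation over ordered cut points.
    Classes 1..j form one superclass and j+1..n_classes the other."""
    if not left or not right:
        return 0.0
    n = len(left) + len(right)
    p_left, p_right = len(left) / n, len(right) / n
    props_left = class_proportions(left, n_classes)
    props_right = class_proportions(right, n_classes)
    cum_left = cum_right = 0.0
    best = 0.0
    for j in range(n_classes - 1):  # cut points between adjacent classes
        cum_left += props_left[j]
        cum_right += props_right[j]
        best = max(best, p_left * p_right * (cum_left - cum_right) ** 2)
    return best
```

Under these assumed definitions, a node mixing classes 1 and 3 is scored as more impure than one mixing classes 1 and 2, which is the ordering information a nominal criterion such as the Gini index ignores.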