Estimating prediction error in microarray classification: Modifications of the 0.632+ bootstrap when ${\bf n} < {\bf p}$
Author(s) - Wenyu Jiang, Bingshu E. Chen
Publication year - 2013
Publication title - Canadian Journal of Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.804
H-Index - 51
eISSN - 1708-945X
pISSN - 0319-5724
DOI - 10.1002/cjs.11158
Subject(s) - resampling, statistics, mathematics, Type I and Type II errors, sample size determination, computer science, data mining
We are interested in estimating prediction error for a classification model built on high-dimensional genomic data when the number of genes ($p$) greatly exceeds the number of subjects ($n$). We examine a distance argument supporting the conventional 0.632+ bootstrap proposed for the $n > p$ scenario, modify it for the $n < p$ situation, and develop learning curves that describe how the true prediction error varies with the number of subjects in the training set. The curves are then applied to define adjusted resampling estimates of the prediction error that balance bias and variability. The adjusted resampling methods are proposed as counterparts of the 0.632+ bootstrap when $n < p$, and are found to improve on the 0.632+ bootstrap and other existing methods in the microarray study setting when the sample size is small and there is some level of differential expression. The Canadian Journal of Statistics 41: 133–150; 2013 © 2012 Statistical Society of Canada
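For orientation, the sketch below illustrates the conventional 0.632+ bootstrap estimator of prediction error (the $n > p$ baseline that the paper modifies), weighting the apparent error and the leave-one-out bootstrap error by the relative overfitting rate. The nearest-centroid classifier, the simulated $n < p$ data, and the number of bootstrap resamples are illustrative assumptions, not the authors' setup or their adjusted estimators.

```python
# Minimal sketch of the conventional 0.632+ bootstrap error estimate
# (Efron & Tibshirani style weighting); classifier and data are
# illustrative assumptions only, not the paper's adjusted methods.
import numpy as np
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)

def bootstrap_632_plus(X, y, fit_predict, B=50):
    n = len(y)
    # Apparent (resubstitution) error on the full training set.
    full_pred = fit_predict(X, y, X)
    err_app = np.mean(full_pred != y)

    # Leave-one-out bootstrap error: each subject is predicted only by
    # classifiers trained on bootstrap resamples that exclude it.
    errs, counts = np.zeros(n), np.zeros(n)
    for _ in range(B):
        idx = rng.integers(0, n, size=n)
        out = np.setdiff1d(np.arange(n), idx)
        if out.size == 0:
            continue
        pred = fit_predict(X[idx], y[idx], X[out])
        errs[out] += (pred != y[out])
        counts[out] += 1
    err_boot1 = np.mean(errs[counts > 0] / counts[counts > 0])

    # No-information error rate gamma and relative overfitting rate R.
    p_hat = np.mean(y == 1)
    q_hat = np.mean(full_pred == 1)
    gamma = p_hat * (1 - q_hat) + (1 - p_hat) * q_hat
    err_boot1_c = min(err_boot1, gamma)
    R = ((err_boot1_c - err_app) / (gamma - err_app)
         if gamma > err_app and err_boot1_c > err_app else 0.0)

    # Weight w moves from 0.632 toward 1 as overfitting grows.
    w = 0.632 / (1 - 0.368 * R)
    return (1 - w) * err_app + w * err_boot1_c

def nearest_centroid(X_train, y_train, X_test):
    return NearestCentroid().fit(X_train, y_train).predict(X_test)

# Toy n < p example: 30 subjects, 500 genes, 10 differentially expressed.
n, p = 30, 500
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :10] += 1.0
print("0.632+ bootstrap error estimate:",
      bootstrap_632_plus(X, y, nearest_centroid))
```

The paper's contribution is to adjust such resampling estimates using learning curves when $n < p$; those adjustments are not reproduced here.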