Open Access

Is cross-validation better than resubstitution for ranking genes?

Author(s) - Ulisses Braga-Neto, Ronaldo F. Hashimoto, Edward R. Dougherty, Danh V. Nguyen, Raymond J. Carroll

Publication year - 2004

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btg399

Subject(s) - computer science, classifier (uml), artificial intelligence, word error rate, probabilistic classification, cross validation, machine learning, linear discriminant analysis, pattern recognition (psychology), data mining, probabilistic logic, ranking (information retrieval), linear classifier, support vector machine, naive bayes classifier

Ranking gene feature sets is a key issue for both phenotype classification, for instance, tumor classification in a DNA microarray experiment, and prediction in the context of genetic regulatory networks. Two broad methods are available to estimate the error (misclassification rate) of a classifier. Resubstitution fits a single classifier to the data, and applies this classifier in turn to each data observation. Cross-validation (in leave-one-out form) removes each observation in turn, constructs the classifier, and then computes whether this leave-one-out classifier correctly classifies the deleted observation. Resubstitution typically underestimates classifier error, severely so in many cases. Cross-validation has the advantage of producing an effectively unbiased error estimate, but the estimate is highly variable. In many applications it is not the misclassification rate per se that is of interest, but rather the construction of gene sets that have the potential to classify or predict. Hence, one needs to rank feature sets based on their performance.
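The two estimators described above, and their use for ranking feature sets, can be illustrated with a minimal sketch. This is not the authors' code or experimental setup. The nearest-mean classifier, the synthetic Gaussian data, and every function name below are assumptions chosen only to keep the example self-contained.

```python
import random
from itertools import combinations


def nearest_mean_fit(X, y):
    """Return one mean vector per class (a simple nearest-mean classifier)."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def nearest_mean_predict(means, x):
    """Assign x to the class whose mean is closest in squared Euclidean distance."""
    def sqdist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda label: sqdist(means[label]))


def resubstitution_error(X, y):
    """Fit once on all observations, then test on those same observations."""
    means = nearest_mean_fit(X, y)
    wrong = sum(nearest_mean_predict(means, x) != t for x, t in zip(X, y))
    return wrong / len(y)


def loo_error(X, y):
    """Leave-one-out: refit without observation i, then classify observation i."""
    wrong = 0
    for i in range(len(y)):
        means = nearest_mean_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        wrong += nearest_mean_predict(means, X[i]) != y[i]
    return wrong / len(y)


def rank_feature_sets(X, y, size, estimator):
    """Score every feature subset of the given size; lowest estimated error first."""
    scored = []
    for subset in combinations(range(len(X[0])), size):
        X_sub = [[row[j] for j in subset] for row in X]
        scored.append((estimator(X_sub, y), subset))
    return sorted(scored)


def make_toy_data(n_per_class=10, n_features=5, seed=0):
    """Hypothetical data: Gaussian noise, with only feature 0 shifted between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 1.5 * label
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for name, estimator in [("resubstitution", resubstitution_error),
                            ("leave-one-out", loo_error)]:
        print(name, rank_feature_sets(X, y, size=2, estimator=estimator)[:3])
```

Running the script prints the three best-ranked feature pairs under each estimator, so the two rankings can be compared directly. With small samples the two orderings may disagree, which is the situation the abstract is concerned with.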
