Making External Validation Valid for Molecular Classifier Development
Author(s) -
YiLin Wu,
Huei-Chung Huang,
LiXuan Qin
Publication year - 2021
Publication title - JCO Precision Oncology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.405
H-Index - 22
ISSN - 2473-4284
DOI - 10.1200/po.21.00103
Subject(s) - normalization (sociology) , computer science , classifier (uml) , database normalization , benchmarking , resampling , data mining , artificial intelligence , test data , machine learning , weighting , pattern recognition (psychology) , medicine , marketing , sociology , anthropology , business , radiology , programming language
PURPOSE Accurate assessment of a molecular classifier that guides patient care is of paramount importance in precision oncology. Recent years have seen an increasing use of external validation for such assessment. However, little is known about how it is affected by ubiquitous unwanted variations in test data because of disparate experimental handling and by the use of data normalization for alleviating such variations.

METHODS In this paper, we studied these issues using two microarray data sets for the same set of tumor samples and additional data simulated by resampling under various levels of signal-to-noise ratio and different designs for array-to-sample allocation.

RESULTS We showed that (1) unwanted variations can lead to biased classifier assessment and (2) data normalization mitigates the bias to varying extents depending on the specific method used. In particular, frozen normalization methods for test data outperform their conventional forms in terms of both reducing the bias in accuracy estimation and increasing robustness to handling effects. We make available our benchmarking tool as an R package on GitHub for performing such evaluation on additional methods for normalization and classification.

CONCLUSION Our findings thus highlight the importance of proper test-data normalization for valid assessment by external validation and call for caution on the choice of normalization method for molecular classifier development.
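
The contrast between frozen and conventional normalization can be illustrated with a minimal sketch. It assumes quantile normalization as the example method and takes "frozen" to mean that the reference distribution is estimated once from the training data and then applied unchanged to each test sample. The function names, the simulated data, and the choice of Python are illustrative assumptions; this is not the authors' R package or their exact procedure.

```python
import numpy as np

def fit_quantile_reference(train):
    """Estimate a reference distribution from training data.

    train: array of shape (samples, features), e.g. log2 intensities.
    Returns the mean of the sorted values across samples, shape (features,).
    """
    return np.sort(train, axis=1).mean(axis=0)

def apply_quantile_reference(data, reference):
    """Map each sample onto a fixed reference distribution by rank.

    Ties are broken arbitrarily; this sketch does not average tied ranks.
    """
    ranks = np.argsort(np.argsort(data, axis=1), axis=1)
    return reference[ranks]

def conventional_quantile_normalize(data):
    """Conventional form: the reference is re-estimated from the data itself."""
    return apply_quantile_reference(data, fit_quantile_reference(data))

# Hypothetical data: test samples carry an additive handling effect.
rng = np.random.default_rng(0)
train = rng.normal(loc=8.0, scale=1.0, size=(10, 200))
test = rng.normal(loc=8.0, scale=1.0, size=(5, 200)) + 1.5

reference = fit_quantile_reference(train)              # frozen on training data
test_frozen = apply_quantile_reference(test, reference)
test_conventional = conventional_quantile_normalize(test)

print("training mean:         ", round(float(train.mean()), 3))
print("test mean, frozen:     ", round(float(test_frozen.mean()), 3))
print("test mean, conventional:", round(float(test_conventional.mean()), 3))
```

In this sketch the frozen version places every test sample on the training-derived scale, and the result for one test sample does not depend on which other test samples happen to be in the batch. The conventional version instead depends on the composition of the test set, which is one plausible route by which handling effects could leak into the accuracy estimate.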