Combining multiple biomarkers linearly to maximize the partial area under the ROC curve
Author(s) - Yan Qingxiang, Bantis Leonidas E., Stanford Janet L., Feng Ziding
Publication year - 2017
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7535
Subject(s) - nonparametric statistics, receiver operating characteristic, parametric statistics, computer science, set (abstract data type), covariance, kernel (algebra), rank (graph theory), biomarker, data mining, statistics, pattern recognition (psychology), mathematics, artificial intelligence, machine learning, biology, biochemistry, combinatorics, programming language
It is now common in clinical practice to base decisions on combinations of multiple biomarkers. In this paper, we propose new approaches for combining multiple biomarkers linearly to maximize the partial area under the receiver operating characteristic curve (pAUC). The parametric and nonparametric methods that have been developed for this purpose have limitations. When the biomarker values for populations with and without a given disease follow a multivariate normal distribution, our proposed parametric approach, which adopts an alternative analytic expression of the pAUC, is easy to implement. When normality assumptions are violated, we present a kernel‐based approach that handles multiple biomarkers simultaneously. We evaluated the proposed and existing methods through simulations and found that when the covariance matrices for the disease and nondisease samples are disproportional, traditional methods (such as logistic regression) are more likely to fail to maximize the pAUC, while the proposed methods are more robust. The proposed approaches are illustrated through application to a prostate cancer data set, and a rank‐based leave‐one‐out cross‐validation procedure is proposed to obtain a realistic estimate of the pAUC when no independent validation set is available.
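To make the target quantity concrete, the following is a minimal Python sketch of the standard empirical (step-function) estimate of the pAUC over false-positive rates in [0, max_fpr] for a fixed linear combination of biomarkers. It is not the parametric or kernel-based method proposed in the paper. The function names `empirical_pauc` and `combined_pauc`, the argument `max_fpr`, and the convention that larger scores indicate disease are all assumptions made for illustration.

```python
import numpy as np


def empirical_pauc(case_scores, control_scores, max_fpr):
    """Empirical pAUC over FPR in [0, max_fpr] (unnormalized; maximum is max_fpr).

    Assumes larger scores indicate disease. Only the controls with the
    highest scores matter, since they determine the low-FPR region.
    """
    case_scores = np.asarray(case_scores, dtype=float)
    control_scores = np.asarray(control_scores, dtype=float)
    n_controls = control_scores.size
    # Number of controls allowed to be false positives at FPR = max_fpr.
    k = int(np.floor(max_fpr * n_controls))
    if k == 0:
        return 0.0
    top_controls = np.sort(control_scores)[-k:]
    # Count (case, control) pairs where the case outranks a top-k control.
    wins = (case_scores[:, None] > top_controls[None, :]).sum()
    return float(wins) / (case_scores.size * n_controls)


def combined_pauc(weights, cases, controls, max_fpr):
    """pAUC of the linear score X @ weights (rows = subjects, columns = biomarkers)."""
    weights = np.asarray(weights, dtype=float)
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    return empirical_pauc(cases @ weights, controls @ weights, max_fpr)
```

A call such as `combined_pauc([1.0, 0.5], cases, controls, max_fpr=0.2)` evaluates one candidate weight vector. This empirical objective is a step function of the weights, so it is hard to optimize directly, which is one reason smooth alternatives such as a kernel-based estimate are attractive.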