Weighted area under the receiver operating characteristic curve and its application to gene selection
Author(s) - Li Jialiang, Fine Jason P.
Publication year - 2010
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/j.1467-9876.2010.00713.x
Subject(s) - estimator, weighting, receiver operating characteristic, measure (data warehouse), parametric statistics, selection (genetic algorithm), flexibility (engineering), variance (accounting), set (abstract data type), nonparametric statistics, sensitivity (control systems), computer science, a weighting, mathematics, statistics, algorithm, data mining, artificial intelligence, physics, engineering, accounting, electronic engineering, acoustics, business, programming language
Summary. The partial area under the receiver operating characteristic curve (PAUC) has been proposed for gene selection by Pepe and co‐workers and thereafter applied in real data analysis. It was noticed from empirical studies that this measure has several key weaknesses, such as an inability to reflect non‐uniform weighting of different decision thresholds, resulting in large numbers of ties. We propose the weighted area under the receiver operating characteristic curve (WAUC) to address the problems that are associated with PAUC. Our proposed measure enjoys a greater flexibility to describe the discrimination accuracy of genes. Non‐parametric and parametric estimation methods are introduced, including PAUC as a special case, along with theoretical properties of the estimators. We also provide a simple variance formula, yielding a novel variance estimator for non‐parametric estimation of PAUC, which has proven challenging in previous work. The methods proposed permit sensitivity analyses, whereby the effect of differing weight functions on gene rankings may be assessed and results may be synthesized across weights. Simulations and reanalysis of a well‐known microarray data set illustrate the practical utility of WAUC.
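Illustration. The summary does not state the estimator itself, so the following is only a minimal sketch of one way a weighted area under an empirical ROC curve could be computed, assuming the weighted area takes the form of the integral of ROC(u) w(u) over false positive rates u in [0, 1]. The function name `weighted_auc`, the weight functions, and the toy expression values are hypothetical and are not taken from the paper. Under this assumption a constant weight gives the ordinary AUC, and an indicator weight on [0, t0] gives an (unnormalized) partial area, consistent with the statement that PAUC is a special case.

```python
import numpy as np

def weighted_auc(cases, controls, weight):
    """Empirical weighted area under the ROC curve (illustrative sketch).

    Assumed form: the average over control values c of TPR(c) * w(FPR(c)),
    which approximates the integral of ROC(u) * w(u) over u in [0, 1].
    This is not claimed to be the estimator proposed in the paper.
    """
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    # Each control value serves as a decision threshold.
    thresholds = controls[:, None]
    # Empirical false positive rate: share of controls strictly above the threshold.
    fpr = (controls[None, :] > thresholds).mean(axis=1)
    # Empirical true positive rate: share of cases strictly above the threshold.
    tpr = (cases[None, :] > thresholds).mean(axis=1)
    return float(np.mean(tpr * weight(fpr)))

# Hypothetical weight functions.
uniform = lambda u: np.ones_like(u)               # ordinary AUC
low_fpr = lambda u: (u <= 0.4).astype(float)      # partial area for FPR <= 0.4

# Toy expression values for one gene (made up for illustration).
cases = [1.0, 3.0]
controls = [0.0, 2.0]
auc = weighted_auc(cases, controls, uniform)
pauc = weighted_auc(cases, controls, low_fpr)
```

Re-running such a function with several weight functions and comparing the resulting gene rankings is one simple way to carry out the kind of sensitivity analysis that the summary mentions. Ties between cases and controls are counted as zero here, which is a simplification.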