Quantifying how diagnostic test accuracy depends on threshold in a meta‐analysis
Author(s) -
Jones Hayley E.,
Gatsonis Constantine A.,
Trikalinos Thomas A.,
Welton Nicky J.,
Ades A.E.
Publication year - 2019
Publication title -
Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.8301
Subject(s) - statistics, sensitivity (control systems), computer science, multinomial distribution, power transform, sample size determination, threshold limit value, econometrics, data mining, mathematics, medicine, artificial intelligence, consistency (knowledge bases), environmental health, electronic engineering, engineering
Tests for disease often produce a continuous measure, such as the concentration of some biomarker in a blood sample. In clinical practice, a threshold C is selected such that results, say, greater than C are declared positive and those less than C negative. Measures of test accuracy such as sensitivity and specificity depend crucially on C, and the optimal value of this threshold is usually a key question for clinical practice. Standard methods for meta‐analysis of test accuracy (i) do not provide summary estimates of accuracy at each threshold, precluding selection of the optimal threshold, and (ii) do not make use of all available data. We describe a multinomial meta‐analysis model that can take any number of pairs of sensitivity and specificity from each study and explicitly quantifies how accuracy depends on C. Our model assumes that some prespecified or Box‐Cox transformation of test results in the diseased and disease‐free populations has a logistic distribution. The Box‐Cox transformation parameter can be estimated from the data, allowing for a flexible range of underlying distributions. We parameterise in terms of the means and scale parameters of the two logistic distributions. In addition to credible intervals for the pooled sensitivity and specificity across all thresholds, we produce prediction intervals, allowing for between‐study heterogeneity in all parameters. We demonstrate the model using two case study meta‐analyses, examining the accuracy of tests for acute heart failure and preeclampsia. We show how the model can be extended to explore reasons for heterogeneity using study‐level covariates.
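The link between threshold and accuracy described in the abstract can be illustrated with a minimal Python sketch. It assumes a single set of known parameters: a Box‐Cox parameter `lam`, and the mean and scale of the logistic distribution of transformed results in the diseased (`mu_d`, `s_d`) and disease‐free (`mu_nd`, `s_nd`) populations. The function names and all numerical values are hypothetical and are not taken from the paper. The sketch leaves out the multinomial likelihood, the between‐study random effects and the estimation step, so it shows only how sensitivity and specificity follow from C once the two distributions are fixed.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam = 0 gives the natural log."""
    if x <= 0:
        raise ValueError("Box-Cox requires a positive value")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def logistic_cdf(z):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def accuracy_at_threshold(c, lam, mu_d, s_d, mu_nd, s_nd):
    """Sensitivity and specificity at threshold c.

    Assumes results greater than c are declared positive and that the
    transformed results are logistic in each population.
    """
    g = box_cox(c, lam)
    sens = 1.0 - logistic_cdf((g - mu_d) / s_d)  # P(result > c | diseased)
    spec = logistic_cdf((g - mu_nd) / s_nd)      # P(result <= c | disease-free)
    return sens, spec


# Hypothetical parameter values on the log scale (lam = 0), for illustration only.
params = dict(lam=0.0, mu_d=math.log(400.0), s_d=0.5,
              mu_nd=math.log(100.0), s_nd=0.5)

for c in (50.0, 100.0, 200.0, 400.0, 800.0):
    sens, spec = accuracy_at_threshold(c, **params)
    print(f"C = {c:6.0f}  sensitivity = {sens:.3f}  specificity = {spec:.3f}")
```

Raising C moves along the trade‐off: sensitivity falls and specificity rises. In this sketch sensitivity is exactly 0.5 when the transformed threshold equals the diseased mean, because the logistic distribution is symmetric about its mean.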