Computer‐aided diagnosis with potential application to rapid detection of disease outbreaks
Author(s) - Burr Tom, Koster Frederick, Picard Rick, Forslund Dave, Wokoun Doug, Joyce Ed, Brillman Judith, Froman Phil, Lee Jack
Publication year - 2007
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.2798
Subject(s) - outbreak , computer science , disease , statistics , virology , medicine , pathology , mathematics
Abstract - Our objectives are to quickly interpret symptoms of emergency patients to identify likely syndromes and to improve population‐wide disease outbreak detection. We constructed a database of 248 syndromes, each syndrome having an estimated probability of producing any of 85 symptoms, with some two‐way, three‐way, and five‐way probabilities reflecting correlations among symptoms. Using these multi‐way probabilities in conjunction with an iterative proportional fitting algorithm allows estimation of full conditional probabilities. Combining these conditional probabilities with misdiagnosis error rates and incidence rates via Bayes' theorem, the probability of each syndrome is estimated. We tested a prototype of computer‐aided differential diagnosis (CADDY) on simulated data and on more than 100 real cases, including West Nile Virus, Q fever, SARS, anthrax, plague, tularaemia and toxic shock cases. We conclude that: (1) it is important to determine whether the unrecorded positive status of a symptom means that the status is negative or that the status is unknown; (2) inclusion of misdiagnosis error rates produces more realistic results; (3) the naive Bayes classifier, which assumes all symptoms behave independently, is slightly outperformed by CADDY, which includes available multi‐symptom information on correlations; as more information regarding symptom correlations becomes available, the advantage of CADDY over the naive Bayes classifier should increase; (4) overlooking low‐probability, high‐consequence events is less likely if the standard output summary is augmented with a list of rare syndromes that are consistent with observed symptoms; and (5) accumulating patient‐level probabilities across a larger population can aid in biosurveillance for disease outbreaks. Copyright © 2007 John Wiley & Sons, Ltd.
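The Bayes step in the abstract can be illustrated with a minimal sketch of the naive Bayes baseline it mentions, in which symptoms are treated as independent given the syndrome. This is not CADDY itself: it omits the iterative proportional fitting of multi‐way probabilities and the misdiagnosis error rates. The syndrome names, incidence rates, symptom probabilities, and the `posterior` function are all hypothetical values chosen for illustration, not figures from the paper. The sketch also shows why conclusion (1) matters, by letting an unrecorded symptom count either as unknown or as negative.

```python
# Hypothetical toy numbers for illustration only; not taken from the paper.
INCIDENCE = {"influenza": 0.90, "anthrax": 0.10}
P_SYMPTOM = {
    "influenza": {"fever": 0.8, "cough": 0.7, "widened_mediastinum": 0.01},
    "anthrax": {"fever": 0.9, "cough": 0.5, "widened_mediastinum": 0.6},
}


def posterior(observed, unrecorded_means_negative=False):
    """Naive Bayes posterior over syndromes (independent symptoms assumed).

    observed maps a symptom name to True (present) or False (absent).
    A symptom missing from observed is skipped (status unknown) unless
    unrecorded_means_negative is set, in which case it counts as absent.
    """
    scores = {}
    for syndrome, prior in INCIDENCE.items():
        likelihood = 1.0
        for symptom, p in P_SYMPTOM[syndrome].items():
            if symptom in observed:
                status = observed[symptom]
            elif unrecorded_means_negative:
                status = False
            else:
                continue  # unknown status contributes no evidence
            likelihood *= p if status else (1.0 - p)
        # Bayes' theorem numerator: incidence (prior) times likelihood
        scores[syndrome] = prior * likelihood
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()}


patient = {"fever": True, "cough": True}
as_unknown = posterior(patient)
as_negative = posterior(patient, unrecorded_means_negative=True)
print(as_unknown["anthrax"], as_negative["anthrax"])
```

With these made-up numbers, reading the unrecorded symptom as negative lowers the estimated probability of the rare syndrome relative to reading it as unknown, which is the kind of sensitivity conclusion (1) warns about.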