A study of the transferability of influenza case detection systems between two large healthcare systems
Author(s) -
Ye Ye,
Michael M. Wagner,
Gregory F. Cooper,
Jeffrey P. Ferraro,
Howard Su,
Per H. Gesteland,
Peter J. Haug,
Nicholas Millett,
John M. Aronis,
Andrew Nowalk,
Víctor Ruiz,
Arturo López Pineda,
Lingyun Shi,
Rudy van Bree,
T. N. Ginter,
Fuchiang Tsui
Publication year - 2017
Publication title -
PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0174970
Subject(s) - parsing , computer science , artificial intelligence , classifier (uml) , transfer of learning , naive bayes classifier , natural language processing , machine learning , random forest , bayesian network , transferability , support vector machine , logit
Objectives
This study evaluates the accuracy and transferability of Bayesian case detection systems (BCDs) that use clinical notes from the emergency department (ED) to detect influenza cases.
Methods
A BCD uses natural language processing (NLP) to infer the presence or absence of clinical findings from ED notes. These findings are fed into a Bayesian network classifier (BN) to infer patients' diagnoses. We developed BCDs at the University of Pittsburgh Medical Center (BCD_UPMC) and at Intermountain Healthcare in Utah (BCD_IH). At each site, we manually built a rule-based NLP parser and trained a Bayesian network classifier on more than 40,000 ED encounters between January 2008 and May 2010, using feature selection, machine learning, and an expert debiasing approach. In this study, the transferability of a BCD may be affected by seven factors: development (source) institution, development parser, application (target) institution, application parser, NLP transfer, BN transfer, and classification task. We used analysis of variance (ANOVA) to study the impact of these factors on BCD performance.
Results
Both BCDs discriminated well between influenza and non-influenza on local test cases (AUCs > 0.92). When tested for transferability on the other institution's cases, the discrimination of BCD_UPMC declined minimally (AUC decreased from 0.95 to 0.94, p<0.01), and the discrimination of BCD_IH declined more (from 0.93 to 0.87, p<0.0001). We attributed the BCD_IH decline to the lower recall of the IH parser on UPMC notes. The ANOVA showed five significant factors: development parser, application institution, application parser, BN transfer, and classification task.
Conclusion
We demonstrated high influenza case detection performance in two large healthcare systems in two geographically separated regions, providing evidentiary support for the use of automated case detection from routinely collected electronic clinical notes in national influenza surveillance. Transferability could be improved by training the Bayesian network classifier locally and by increasing the accuracy of the NLP parser.
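The evaluation idea in the abstract (score encounters from parsed findings, then compare AUC on local cases against AUC on another site's cases) can be illustrated with a small sketch. This is not the authors' system. It assumes a naive Bayes model as a simplified stand-in for the Bayesian network classifier, and the finding names (`fever`, `cough`), the toy records, and the way a lower-recall parser is simulated (dropping a finding from some positive cases) are all hypothetical.

```python
from math import exp, log


def train_naive_bayes(records, labels, alpha=1.0):
    """Estimate P(class) and P(finding present | class) with Laplace smoothing."""
    findings = sorted({f for r in records for f in r})
    model = {"findings": findings, "prior": {}, "cond": {}}
    for c in sorted(set(labels)):
        rows = [r for r, y in zip(records, labels) if y == c]
        model["prior"][c] = len(rows) / len(records)
        model["cond"][c] = {
            f: (sum(r.get(f, False) for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for f in findings
        }
    return model


def posterior(model, record, positive=True):
    """Return P(positive class | findings) under the naive Bayes assumption."""
    log_joint = {}
    for c, prior in model["prior"].items():
        total = log(prior)
        for f in model["findings"]:
            p = model["cond"][c][f]
            total += log(p if record.get(f, False) else 1.0 - p)
        log_joint[c] = total
    top = max(log_joint.values())  # subtract the max for numerical stability
    norm = sum(exp(v - top) for v in log_joint.values())
    return exp(log_joint[positive] - top) / norm


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical "source site" encounters: findings as a parser might emit them.
source_records = [
    {"fever": True, "cough": True},
    {"fever": True, "cough": False},
    {"fever": True, "cough": True},
    {"fever": False, "cough": True},
    {"fever": False, "cough": False},
    {"fever": False, "cough": False},
]
source_labels = [True, True, True, False, False, False]  # True = influenza

# Hypothetical "target site": same cases, but the parser misses fever in two positives.
target_records = [dict(r) for r in source_records]
target_records[0]["fever"] = False
target_records[1]["fever"] = False

model = train_naive_bayes(source_records, source_labels)
auc_local = auc([posterior(model, r) for r in source_records], source_labels)
auc_transferred = auc([posterior(model, r) for r in target_records], source_labels)
print(f"local AUC: {auc_local:.2f}, transferred AUC: {auc_transferred:.2f}")
```

In this toy setting the transferred AUC falls below the local AUC only because the simulated parser misses findings. That mirrors, in a very reduced form, the abstract's explanation that lower parser recall on the other site's notes reduced discrimination.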