414. Developing Digital Phenotypes of Primary Immune Deficiencies Using Machine Learning on a Large Electronic Health Record Database
Author(s) - Leo Meister, Christa S. Zerbe, Luigi D. Notarangelo, Sameer S. Kadri, D. Rebecca Prevots, Emily Ricotta
Publication year - 2019
Publication title - Open Forum Infectious Diseases
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.546
H-Index - 35
ISSN - 2328-8957
DOI - 10.1093/ofid/ofz360.487
Subject(s) - medicine, primary immunodeficiency, DiGeorge syndrome, receiver operating characteristic, immune system, pediatrics, immunology, psychiatry
Background: More than 350 genetic disorders cause immune deficiencies. Because these conditions are rare, in-depth study of the infections associated with primary immune deficiencies (PID) requires extremely large sample sizes drawn from broad populations. Using a large electronic health record (EHR) dataset, we linked clinical and microbiologic data to develop digital phenotypes for PID.

Methods: Using the Cerner HealthFacts EHR dataset from 2009 to 2017, we extracted clinical and microbiologic data for hospitalizations of patients <18 years old with ICD-9/ICD-10 PID diagnoses and ≥1 positive culture for infection. Machine learning models were used to identify key features that predict PID diagnosis. Features included patient and hospitalization characteristics, infectious agent and infection site, and selected comorbidities. Models were validated using the area under the receiver operating characteristic curve (AUC).

Results: Overall, 1316 patients with a PID were identified (Table 1). The 10 most common pathogens identified for each PID are listed in Table 2. The models classified DiGeorge syndrome (positive predictive value [PPV] 49%), functional disorders of polymorphonuclear neutrophils (PMN) (PPV 43%), and common variable immunodeficiency (CVID) (PPV 47%) better than combined immunodeficiency (CID) (PPV 20%); the overall true positive rate was 47%, with an AUC of 0.73. Predictive features for each PID were as follows: for CVID, having enteritis, hypertension, and pneumonia (Figure 1a); for PMN, having hypoxia and hypertension (Figure 1b); for DiGeorge syndrome, having congenital deformities and not having hypertension (Figure 1c); and for CID, finding Staphylococcus aureus in a wound or Escherichia coli in the blood (Figure 1d).

Conclusion: Early models demonstrate some discrimination, specifically for more common PIDs (CVID) and those with highly identifying factors (DiGeorge syndrome). These models can be improved by including a wider array of clinical data, and they provide a first look at a new methodology to digitally phenotype PIDs for future diagnostic use.

Disclosures: All authors: No reported disclosures.
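The validation metrics reported above (PPV, true positive rate, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `ppv_and_tpr` and `auc`, the 0.5 threshold, and the toy labels and scores are all hypothetical, and the sketch treats a single PID as a binary (one-vs-rest) problem, which the abstract does not specify.

```python
from typing import Sequence, Tuple


def ppv_and_tpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (positive predictive value, true positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, tpr


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = has the PID of interest, 0 = any other PID.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

ppv, tpr = ppv_and_tpr(y_true, y_pred)
print(f"PPV={ppv:.2f}  TPR={tpr:.2f}  AUC={auc(y_true, scores):.3f}")
```

PPV and the true positive rate depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason an abstract may report both.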
