A Pilot Study to Improve the Use of Electronic Health Records for Identification of Patients with Social Determinants of Health Challenges: A Collaboration of Johns Hopkins Health System and Kaiser Permanente
Author(s) - Hatef Elham, Rouhizadeh Masoud, Nau Claudia, Xie Fagen, Padilla Ariadna, Lyons Lindsay Joe, Rouillard Christopher, AbuNasser Mahmoud, Roblin Douglas
Publication year - 2021
Publication title - Health Services Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.706
H-Index - 121
eISSN - 1475-6773
pISSN - 0017-9124
DOI - 10.1111/1475-6773.13756
Subject(s) - social determinants of health, psychological intervention, documentation, health care, identification (biology), health equity, social media, medical record, international classification of functioning, disability and health, medicine, computer science, artificial intelligence, public health, political science, nursing, world wide web, rehabilitation, physical therapy, botany, radiology, law, biology, programming language
Research Objective: The International Classification of Diseases (ICD) coding system has codes for recording social determinants of health (SDOH); however, non-clinical issues are documented in electronic health records (EHRs) far less often than medical conditions. Relying on ICD codes in EHRs to identify SDOH may therefore under-report patients with social needs and risks, which makes it difficult for healthcare systems to target "high risk" patients for interventions addressing social needs. SDOH may be discussed with healthcare providers during visits and, therefore, recorded in EHR free-text notes (also known as providers' notes). These notes might provide a more accurate accounting of SDOH; however, traditional approaches to reviewing and abstracting patient information from medical record notes are laborious, expensive, and slow. Recent developments in text mining and natural language processing (NLP) of digitized text allow for reliable, low-cost, and rapid extraction of information from EHRs. In this pilot project we evaluated whether an NLP algorithm could extract valid measures of SDOH from Epic-based EHRs in three healthcare systems: Johns Hopkins Health System (JHHS), Kaiser Permanente Mid-Atlantic States (KPMAS), and KP Southern California (KPSCal). The focus of our study was residential instability (i.e., homelessness and housing insecurity).

Study Design: The study was conducted independently at each site, within a parallel and coordinated framework. The validation assessment and NLP algorithm logic were identical across sites; however, the "gold standard" for assessing algorithm validity differed according to data availability. Using the EntityRuler module of the spaCy 2.3 Python toolkit, we created a rule-based NLP system made up of 61 expert-developed patterns that, if present, would represent residential instability. Our patterns included word 'lemmas' and base forms to account for morphological variations (e.g., singular and plural forms), as well as substitutions of different prepositions (e.g., about and for) and synonyms (e.g., house, apartment, and home). We calibrated and then validated the algorithm using a split-sample approach. Validity was assessed at each site by measures of sensitivity and specificity.

Population Studied: Beneficiaries ≥18 years of age during 2016 through 2019 who received care at JHHS, KPMAS, or KPSCal.

Principal Findings: The following table presents the characteristics of the study population and the performance of the NLP algorithm at each study site.

| Measure | JHHS | KPMAS | KPSCal |
| --- | --- | --- | --- |
| Study population (patient no.) | ~1,200,000 | ~1,600,000 | ~4,700,000 |
| NLP validation: sample size, patients/response no. (with+/without- residential instability) | 1000 (500+/500-) | 8197 (833+/7364-) | 300 (150+/150-) |
| NLP validation: clinical note no. | 134,062 | 78,825 | 9575 |
| NLP algorithm performance: sensitivity | 0.84 | 0.61 | 0.96 |
| NLP algorithm performance: specificity | 0.96 | 0.87 | 0.97 |

Gold standard methods for NLP validation, in the order listed in the source: SDOH Questionnaire, SDOH Questionnaire, SDOH ICD codes, Manual Annotation.

Conclusions: The consistent performance of this NLP algorithm in identifying residential instability in three different healthcare systems suggests the algorithm is generalizable. The consistent and relatively high sensitivity and specificity demonstrate the algorithm's validity.

Implications for Policy or Practice: Development of generalizable NLP algorithms with promising performance will enhance the value of EHRs to identify at-risk patients across different health systems, to improve patient care and outcomes, and to mitigate socioeconomic disparities across individuals and communities.

Primary Funding Source: Johns Hopkins and Kaiser Permanente Research Collaboration Committee Pilot Awards.
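
The validation logic described under Study Design (flag a note when any rule fires, then compare the flags with a gold standard) can be illustrated with a minimal sketch. It rests on several assumptions: the four patterns and six example notes are hypothetical and are not the study's 61 expert-developed patterns or its data; plain regular expressions stand in for spaCy EntityRuler token patterns so that the example needs only the standard library; and the function names `flags_instability` and `sensitivity_specificity` are invented for illustration.

```python
import re

# Hypothetical rules, NOT the study's 61 expert-developed patterns.
# Regex alternations approximate the variation the abstract describes:
# word forms (live/lives/living), prepositions (in/at), synonyms (house/home).
PATTERNS = [
    re.compile(r"\bhomeless(ness)?\b", re.IGNORECASE),
    re.compile(
        r"\b(live|lives|living|stay|stays|staying)\s+(in|at)\s+(a\s+|the\s+)?shelters?\b",
        re.IGNORECASE,
    ),
    re.compile(r"\b(evicted|eviction)\s+(from|notice)\b", re.IGNORECASE),
    re.compile(
        r"\b(lost|losing)\s+(his|her|their)\s+(house|apartment|home)\b",
        re.IGNORECASE,
    ),
]


def flags_instability(note: str) -> bool:
    """Return True if any rule matches the free-text note."""
    return any(p.search(note) for p in PATTERNS)


def sensitivity_specificity(predicted, gold):
    """Compute (sensitivity, specificity) from parallel lists of booleans.

    Assumes the gold standard contains at least one positive and one
    negative case; otherwise a denominator would be zero.
    """
    pairs = list(zip(predicted, gold))
    tp = sum(p and g for p, g in pairs)
    fn = sum((not p) and g for p, g in pairs)
    tn = sum((not p) and (not g) for p, g in pairs)
    fp = sum(p and (not g) for p, g in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up notes with made-up gold-standard labels (True = residential instability).
SAMPLE = [
    ("Patient is currently homeless and sleeping in a car.", True),
    ("Pt staying at the shelter since March.", True),
    ("Couch surfing with friends after losing the lease.", True),
    ("Lives with spouse in own home; no housing concerns.", False),
    ("Volunteers at a homeless outreach program.", False),
    ("Denies any difficulty paying rent.", False),
]

predicted = [flags_instability(text) for text, _ in SAMPLE]
gold = [label for _, label in SAMPLE]
sensitivity, specificity = sensitivity_specificity(predicted, gold)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In this toy sample the rules miss the "couch surfing" note (a false negative) and wrongly flag the volunteer note (a false positive), which shows how gaps in pattern coverage lower sensitivity and how context-free matching lowers specificity.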