A normalization method for combination of laboratory test results from different electronic healthcare databases in a distributed research network
Author(s) - Yoon Dukyong, Schuemie Martijn J., Kim Ju Han, Kim Dong Ki, Park Man Young, Ahn Eun Kyoung, Jung EunYoung, Park Dong Kyun, Cho Soo Yeon, Shin Dahye, Hwang Yeonsoo, Park Rae Woong
Publication year - 2016
Publication title - Pharmacoepidemiology and Drug Safety
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.023
H-Index - 96
eISSN - 1099-1557
pISSN - 1053-8569
DOI - 10.1002/pds.3893
Subject(s) - medicine, normalization (sociology), hematocrit, observational study, creatinine, retrospective cohort study, statistics, blood urea nitrogen, database normalization, population, standard deviation, mathematics, environmental health, cluster analysis, sociology, anthropology
Purpose: Distributed research networks (DRNs) afford statistical power by integrating observational data from multiple partners for retrospective studies. However, laboratory test results across care sites are derived using different assays from varying patient populations, making it difficult to simply combine data for analysis. Additionally, existing normalization methods are not suitable for retrospective studies. We normalized laboratory results from different data sources by adjusting for heterogeneous clinico‐epidemiologic characteristics of the data and called this the subgroup‐adjusted normalization (SAN) method.

Methods: Subgroup‐adjusted normalization renders the means and standard deviations of distributions identical under population structure‐adjusted conditions. To evaluate its performance, we compared SAN with existing methods for simulated and real datasets consisting of blood urea nitrogen, serum creatinine, hematocrit, hemoglobin, serum potassium, and total bilirubin. Various clinico‐epidemiologic characteristics can be applied together in SAN. For simplicity of comparison, age and gender were used to adjust population heterogeneity in this study.

Results: In simulations, SAN had the lowest standardized difference in means (SDM) and Kolmogorov–Smirnov values for all tests (p < 0.05). In a real dataset, SAN had the lowest SDM and Kolmogorov–Smirnov values for blood urea nitrogen, hematocrit, hemoglobin, and serum potassium, and the lowest SDM for serum creatinine (p < 0.05).

Conclusion: Subgroup‐adjusted normalization performed better than normalization using other methods. The SAN method is applicable in a DRN environment and should facilitate analysis of data integrated across DRN partners for retrospective observational studies.

Copyright © 2015 John Wiley & Sons, Ltd.
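The abstract states the goal of SAN (equal means and standard deviations once population structure is adjusted for) but not the computation itself. The Python sketch below is one possible reading of that idea, not the authors' published algorithm. It assumes that each record carries a subgroup label (for example an age band and gender), that the reference site's subgroup proportions serve as the common population structure, and that a single linear rescaling is applied to the source site. All function names are hypothetical. The SDM helper uses a common definition (absolute mean difference divided by the root of the average variance), which may differ from the one used in the paper.

```python
from collections import defaultdict
from math import sqrt
from statistics import fmean, pvariance


def subgroup_stats(records):
    """Map each subgroup to (count, mean, population variance).

    `records` is an iterable of (subgroup, value) pairs.
    """
    groups = defaultdict(list)
    for subgroup, value in records:
        groups[subgroup].append(value)
    return {g: (len(v), fmean(v), pvariance(v)) for g, v in groups.items()}


def adjusted_mean_sd(stats, weights):
    """Mean and SD of the mixture obtained by reweighting subgroups.

    `weights` gives the assumed common population structure and must sum to 1.
    """
    mean = sum(w * stats[g][1] for g, w in weights.items())
    second_moment = sum(
        w * (stats[g][2] + stats[g][1] ** 2) for g, w in weights.items()
    )
    return mean, sqrt(max(second_moment - mean ** 2, 0.0))


def san_like_normalize(source, reference):
    """Linearly rescale `source` values onto the scale of `reference`.

    Assumption (not from the abstract): the reference site's subgroup
    proportions, restricted to subgroups present at both sites, define
    the population structure used for adjustment.
    """
    s_stats = subgroup_stats(source)
    r_stats = subgroup_stats(reference)
    shared = s_stats.keys() & r_stats.keys()
    if not shared:
        raise ValueError("no subgroup is present at both sites")
    total = sum(r_stats[g][0] for g in shared)
    weights = {g: r_stats[g][0] / total for g in shared}

    s_mean, s_sd = adjusted_mean_sd(s_stats, weights)
    r_mean, r_sd = adjusted_mean_sd(r_stats, weights)
    if s_sd == 0:
        raise ValueError("source values have zero adjusted spread")

    scale = r_sd / s_sd
    return [(g, r_mean + (v - s_mean) * scale) for g, v in source]


def standardized_difference_in_means(a, b):
    """Absolute mean difference divided by the root of the average variance."""
    pooled_sd = sqrt((pvariance(a) + pvariance(b)) / 2)
    return abs(fmean(a) - fmean(b)) / pooled_sd
```

Because the adjustment reweights subgroups before matching moments, a source site that differs from the reference only by an assay-like linear shift and a different age or gender mix is mapped back onto the reference scale, whereas matching raw means and standard deviations would absorb the mix difference into the correction. A Kolmogorov–Smirnov comparison of the normalized and reference values could be added with a statistics library, which is omitted here to keep the sketch dependency-free.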