Mining trauma injury data with imputed values
Author(s) - Penny Kay, Chesney Thomas
Publication year - 2009
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.10044
Subject(s) - missing data, imputation (statistics), logistic regression, glasgow coma scale, statistics, predictive value, computer science, data mining, medicine, mathematics, surgery
Methods for analyzing trauma injury data with missing values, collected at a UK hospital, are reported. One measure of injury severity, the Glasgow coma score (GCS), which is known to be associated with patient death, is missing for 12% of patients in the dataset. To include these patients in the analysis, three different data imputation techniques are used to estimate the missing values. The imputed datasets are analyzed by an artificial neural network and by logistic regression, and the results are compared in terms of sensitivity, specificity, positive predictive value and negative predictive value. Although there is little distinction between the results of the three imputation methods on the overall dataset, the hot‐deck imputation method appears to give more accurate results than the model‐based or propensity score imputation methods when the comparison is restricted to the subset of patients with imputed GCS values. The results show that imputation does not reduce overall predictive accuracy following a data‐mining analysis, demonstrating that all cases may be included when analyzing these trauma injury data. Copyright © 2009 Wiley Periodicals, Inc., A Wiley Company
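The four comparison measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of that calculation, not the authors' code: the function name `diagnostic_metrics`, the coding of outcomes (1 = died, 0 = survived), and the small example vectors are illustrative assumptions and do not come from the paper's data.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, PPV and NPV for binary outcomes.

    Assumed coding (hypothetical): 1 = patient died, 0 = patient survived.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a measure is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # deaths correctly predicted
        "specificity": ratio(tn, tn + fp),  # survivors correctly predicted
        "ppv": ratio(tp, tp + fp),          # predicted deaths that occurred
        "npv": ratio(tn, tn + fn),          # predicted survivals that occurred
    }


# Made-up outcomes and predictions, for illustration only.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = diagnostic_metrics(observed, predicted)
print(metrics)
```

Applying the same function once to the full dataset and once to only the cases with an imputed GCS would mirror the two levels of comparison the abstract describes, assuming predictions from each model are available for both groups.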