Assessing geographic heterogeneity and variable importance in an air pollution data set
Author(s) - S. Stanley Young, Jessie Q. Xia
Publication year - 2013
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11202
Subject(s) - context (archaeology), longevity, air quality index, variable (mathematics), econometrics, air pollution, set (abstract data type), observational study, data set, regression analysis, variables, geography, computer science, statistics, mathematics, meteorology, machine learning, ecology, medicine, mathematical analysis, programming language, gerontology, archaeology, biology
In this article, we examine the relationship between air quality and mortality in the United States using a published observational data set. Observational studies are complex and open to various interpretations. We show that the effect of air pollution on longevity is geographically heterogeneous. We also show that air pollution is much less important for longevity than income or smoking. Most often, authors do not address the relative importance of the variables under consideration, choosing instead to concentrate on specific claims of significance. Yet good policy decisions require knowledge of the magnitude of the relevant effects. Our analysis uses three methods for determining variable importance and shows how doing so puts predictor variables into a context that supports sound environmental policymaking. In particular, using both regression and recursive partitioning, we confirm a spatial interaction with the air quality variable PM2.5: there is no significant association of PM2.5 with longevity in the western United States. We also determine the relative importance of PM2.5 in comparison with the other predictor variables available in this data set. Our findings call into question the claim made by the original researchers. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2013
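
The regression side of the approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the variable names, the helper fit_interaction_model, and every number below are assumptions chosen only to mimic the qualitative pattern the abstract reports (a PM2.5 slope that vanishes in the west, and income and smoking effects larger than the PM2.5 effect). Predictors are standardized so that coefficient magnitudes serve as one crude, hypothetical measure of relative importance; the article's three importance methods are not reproduced here.

```python
import numpy as np


def fit_interaction_model(pm25, income, smoking, west, longevity):
    """Ordinary least squares with a PM2.5-by-region interaction term.

    Continuous predictors are standardized, so each slope is the change in
    longevity per one standard deviation of that predictor.
    """
    def standardize(x):
        return (x - x.mean()) / x.std()

    z_pm25 = standardize(pm25)
    z_income = standardize(income)
    z_smoking = standardize(smoking)
    design = np.column_stack([
        np.ones_like(z_pm25),  # intercept
        z_pm25,                # PM2.5 slope in the east (west == 0)
        z_income,
        z_smoking,
        west,                  # regional shift in the intercept
        z_pm25 * west,         # how the PM2.5 slope differs in the west
    ])
    coef, *_ = np.linalg.lstsq(design, longevity, rcond=None)
    names = ["intercept", "pm25", "income", "smoking", "west", "pm25_x_west"]
    return dict(zip(names, coef))


# Synthetic data for illustration only; none of these values come from the study.
rng = np.random.default_rng(0)
n = 2000
west = rng.integers(0, 2, n).astype(float)  # 1 = western region (assumed coding)
pm25 = rng.normal(12.0, 3.0, n)
income = rng.normal(50.0, 10.0, n)
smoking = rng.normal(20.0, 5.0, n)

# Simulated truth: PM2.5 lowers longevity in the east and has no effect in the west.
longevity = (
    75.0
    + 0.08 * income
    - 0.15 * smoking
    - 0.10 * pm25 * (1.0 - west)
    + rng.normal(0.0, 0.5, n)
)

coefs = fit_interaction_model(pm25, income, smoking, west, longevity)
east_slope = coefs["pm25"]
west_slope = coefs["pm25"] + coefs["pm25_x_west"]

print(f"PM2.5 slope, east: {east_slope:.3f}")
print(f"PM2.5 slope, west: {west_slope:.3f}")
print(f"income slope:      {coefs['income']:.3f}")
print(f"smoking slope:     {coefs['smoking']:.3f}")
```

In this simulated setting the western slope is the sum of the main PM2.5 coefficient and the interaction coefficient, which is how a spatial interaction shows up in a single regression. A recursive-partitioning check, the second approach the abstract names, would instead look for a tree split on region ahead of any split on PM2.5.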