Impacts of Positional Error on Spatial Regression Analysis: A Case Study of Address Locations in Syracuse, New York
Author(s) - Griffith Daniel A, Millones Marco, Vincent Matthew, Johnson David L, Hunt Andrew
Publication year - 2007
Publication title - Transactions in GIS
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.721
H-Index - 63
eISSN - 1467-9671
pISSN - 1361-1682
DOI - 10.1111/j.1467-9671.2007.01067.x
Subject(s) - geocoding, georeference, geography, regression analysis, regression, computer science, matching (statistics), geographic information system, cluster analysis, statistics, cartography, identification (biology), spatial analysis, data mining, remote sensing, mathematics, physical geography, botany, biology
Positional error is the error produced by the discrepancy between reference and recorded locations. In urban landscapes, locations typically are obtained from global positioning systems or geocoding software. Although these technologies have improved the locational accuracy of georeferenced data, they are not error free. This error affects results of any spatial statistical analysis performed with a georeferenced dataset. In this paper we discuss the properties of positional error in an address matching exercise and the allocation of point locations to census geography units. We focus on the error's spatial structure, and more particularly on impacts of error propagation in spatial regression analysis. For this purpose we use two geocoding sources, we briefly describe the magnitude and the nature of their discrepancies, and we evaluate the consequences that this type of locational error has on a spatial regression analysis of pediatric blood lead data for Syracuse, NY. Our findings include: (1) the confirmation of the recurrence of spatial clustering in positional error at various geographic resolutions; and, (2) the identification of a noticeable but not shockingly large impact from positional error propagation in spatial auto‐binomial regression analysis results for the dataset analyzed.
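As an illustration of the two quantities the abstract refers to, the following minimal sketch computes positional error as the distance between paired reference and recorded locations, then summarizes its spatial clustering with a global Moran's I. It is not the authors' procedure: the function names, the assumption of projected planar coordinates (for example, meters), and the binary distance-band weights are all choices made here for illustration.

```python
import math


def positional_errors(reference, recorded):
    """Euclidean distance between each paired reference and recorded point.

    Assumes both lists hold (x, y) tuples in the same projected coordinate
    system, in the same order (hypothetical input layout).
    """
    return [
        math.hypot(rx - gx, ry - gy)
        for (rx, ry), (gx, gy) in zip(reference, recorded)
    ]


def morans_i(points, values, band):
    """Global Moran's I with binary distance-band weights.

    w_ij = 1 when 0 < d_ij <= band, else 0. The weighting scheme is an
    assumption of this sketch, not something stated in the abstract.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross = 0.0  # sum of w_ij * dev_i * dev_j
    s0 = 0.0     # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            if d <= band:
                s0 += 1.0
                cross += dev[i] * dev[j]

    variance_sum = sum(d * d for d in dev)
    if s0 == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined: no neighbors or no variation")
    return (n / s0) * cross / variance_sum
```

Passing the output of `positional_errors` as `values`, with the reference locations as `points`, gives a single number: values well above zero suggest that large errors tend to occur near other large errors, which is the kind of clustering the abstract reports.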