Delaunay triangulation‐based spatial colocation pattern mining without distance thresholds
Author(s) - Tran Vanha, Wang Lizhen
Publication year - 2020
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11457
Subject(s) - Delaunay triangulation, enhanced data rates for GSM evolution, k-nearest neighbors algorithm, computer science, data mining, pattern recognition (psychology), feature (linguistics), Voronoi diagram, table (database), algorithm, feature vector, artificial intelligence, mathematics, linguistics, philosophy, geometry
A spatial colocation pattern is a group of spatial features whose instances frequently appear together in close proximity to each other. The proximity of instances is generally measured by the distance between them: if the distance between two instances is smaller than a user-specified distance threshold, they have a neighbor relationship. However, it is difficult for users to choose a suitable distance threshold, and mining results vary widely with different thresholds. In addition, a single distance threshold makes it hard to accurately obtain the neighborhoods of instances in data sets with heterogeneous distribution density. In this study, we propose a new method, based on Delaunay triangulation (DT), for determining the neighbor relationship of instances in space without a distance threshold. We design three filtering strategies, namely feature invalid edge, global positive edge, and local positive edge, to constrain the original DT so that it accurately captures the neighborhoods of instances in space. Then, a miner called DT‐based colocation (DTC) pattern mining is developed. Unlike traditional algorithms, which adopt the time‐consuming generate‐test candidate model, DTC collects the table instances of colocation patterns directly from the constrained DT by building neighboring polygons, and then filters the prevalent patterns. We compare the results mined by DTC with those of traditional algorithms at the macrolevel and microlevel on both real and synthetic data sets to show that the DTC algorithm improves the effectiveness and fineness of mining results.
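The idea of deriving neighbor relationships from a constrained DT rather than from a distance threshold can be illustrated with a minimal Python sketch. It is not the authors' DTC implementation. The function names (`circumcircle`, `delaunay_edges`, `constrained_neighbors`) and the parameter `k` are hypothetical, the triangulation is a brute-force empty-circumcircle test suitable only for tiny inputs, and the two filters are simplified stand-ins: an edge is assumed to be "feature invalid" when it joins two instances of the same feature, and an edge is assumed to be globally too long when it exceeds the mean edge length plus `k` standard deviations. The local positive edge strategy and the neighboring-polygon step are not modeled.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def circumcircle(a, b, c):
    """Return ((ux, uy), r_squared) of the circle through a, b, c, or None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_edges(points):
    """Brute-force Delaunay edges: keep triangles whose circumcircle holds no other point."""
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        cc = circumcircle(points[i], points[j], points[k])
        if cc is None:
            continue
        (ux, uy), r2 = cc
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
               for m, (px, py) in enumerate(points) if m not in (i, j, k)):
            edges.update({(i, j), (i, k), (j, k)})  # i < j < k, so pairs are ordered
    return edges


def constrained_neighbors(instances, k=1.0):
    """instances: list of (feature, x, y). Returns the index pairs kept as neighbors."""
    points = [(x, y) for _, x, y in instances]
    edges = delaunay_edges(points)
    if not edges:
        return set()
    lengths = {e: dist(points[e[0]], points[e[1]]) for e in edges}
    # Assumed global filter: cut edges longer than mean + k * std of all DT edge lengths.
    cutoff = mean(lengths.values()) + k * pstdev(lengths.values())
    # Assumed feature filter: an edge between two instances of the same feature is dropped.
    return {(i, j) for (i, j), length in lengths.items()
            if length <= cutoff and instances[i][0] != instances[j][0]}


# Toy data: a tight cluster (A, B, B) plus one distant instance of C.
instances = [("A", 0.0, 0.0), ("B", 1.0, 0.0), ("B", 0.0, 1.0), ("C", 10.0, 10.0)]
print(sorted(constrained_neighbors(instances)))  # [(0, 1), (0, 2)]
```

In this toy data set the DT links the distant C instance to both B instances, but those two edges are much longer than the rest and are cut by the assumed global filter, while the B-B edge is cut by the assumed feature filter. No distance threshold is supplied by the user; the cutoff comes from the edge-length statistics of the triangulation itself.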