Grid-Based Colocation Mining Algorithms on GPU for Big Spatial Event Data: A Summary of Results
2017; Springer Science+Business Media; Linguagem: Inglês
10.1007/978-3-319-64367-0_14
ISSN1611-3349
Autores Tópico(s)Rough Sets and Fuzzy Logic
ResumoThis paper investigates the colocation pattern mining problem for big spatial event data. Colocation patterns refer to subsets of spatial features whose instances are frequently located together. The problem is important in many applications such as analyzing relationships of crimes or disease with various environmental factors, but is computationally challenging due to a large number of instances, the potentially exponential number of candidate patterns, and high computational cost in generating pattern instances. Existing colocation mining algorithms (e.g., Apriori algorithm, multi-resolution filter, partial join and joinless approaches) are mostly sequential, and thus can be insufficient for big spatial event data. Recently, parallel colocation mining algorithms have been developed based on the Map-reduce framework. However, these algorithms need a large number of nodes to scale up, which is economically expensive, and their reducer nodes have a bottleneck of aggregating all instances of the same colocation patterns. Another work proposes a parallel colocation mining algorithm on GPU based on the iCPI tree and the joinless approach, but assumes that the number of neighbors for each instance is within a small constant, and thus may be inefficient when instances are dense and unevenly distributed. To address these limitations, we propose a grid-based GPU colocation mining algorithm that includes a novel cell aggregate based upper bound filter, and two refinement algorithms. We prove the correctness and completeness of proposed GPU algorithms. Preliminary results on both real world data and synthetic data show that proposed GPU algorithms are promising with over 30 times speedup on up to millions of instances.
Referência(s)