TY - JOUR
T1 - Efficient parallel processing of high-dimensional spatial kNN queries
AU - Jiang, Tao
AU - Zhang, Bin
AU - Lin, Dan
AU - Gao, Yunjun
AU - Li, Qing
PY - 2022/5/2
Y1 - 2022/5/2
N2 - Some efficient top-k algorithms, i.e., Fagin's Algorithm, threshold algorithm (TA), and best position algorithm (BPA), can be used to answer k nearest neighbor (kNN) queries. However, extending the existing algorithms without further changes to the algorithms themselves would not be efficient since there are the different characteristics between the kNN queries and top-k queries. For example, the kNN queries are more distance-sensitive rather than the position of data points. Second, it is necessary to add some novel parallel heuristics and pruning policies for the kNN queries. Third, there are still many redundant random accesses among FA, TA, and BPA. In this paper, we address aforementioned these problems and take these algorithms to answer parallel kNN (PkNN) queries in spatial databases. We integrate the advantages of the B + -tree and Open MP programming and propose three efficient parallel kNN query algorithms, namely distance priority-based PkNN, optimized PkNN, and partition-based PkNN. Our performance evaluation shows that our proposed algorithms achieve significant improvement in comparison with existing algorithms, i.e., BPA and BPA2. In addition, our approaches are also capable of returning kNN results incrementally which greatly shorten the query response time and enhance user experience.
AB - Some efficient top-k algorithms, i.e., Fagin's Algorithm, threshold algorithm (TA), and best position algorithm (BPA), can be used to answer k nearest neighbor (kNN) queries. However, extending the existing algorithms without further changes to the algorithms themselves would not be efficient since there are the different characteristics between the kNN queries and top-k queries. For example, the kNN queries are more distance-sensitive rather than the position of data points. Second, it is necessary to add some novel parallel heuristics and pruning policies for the kNN queries. Third, there are still many redundant random accesses among FA, TA, and BPA. In this paper, we address aforementioned these problems and take these algorithms to answer parallel kNN (PkNN) queries in spatial databases. We integrate the advantages of the B + -tree and Open MP programming and propose three efficient parallel kNN query algorithms, namely distance priority-based PkNN, optimized PkNN, and partition-based PkNN. Our performance evaluation shows that our proposed algorithms achieve significant improvement in comparison with existing algorithms, i.e., BPA and BPA2. In addition, our approaches are also capable of returning kNN results incrementally which greatly shorten the query response time and enhance user experience.
KW - Nearest neighbor queries
KW - Parallel
KW - Algorithm
KW - Performance
KW - Spatial database
KW - NEAREST-NEIGHBOR SEARCH
KW - TREE
UR - http://gateway.isiknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=LinksAMR&SrcApp=PARTNER_APP&DestLinkType=FullRecord&DestApp=WOS&KeyUT=000789729700002
U2 - 10.1007/s00500-022-07081-0
DO - 10.1007/s00500-022-07081-0
M3 - RGC 21 - Publication in refereed journal
SN - 1432-7643
JO - Soft Computing
JF - Soft Computing
ER -