TY - GEN
T1 - Multilevel data clustering for spatial join processing
AU - Xiao, Jitian
AU - Zhang, Yanchun
AU - Jia, Xiaohua
N1 - Publisher Copyright:
© 2000 IEEE.
PY - 1999/11
Y1 - 1999/11
N2 - The I/O cost of spatial join processing could be very high due to the large sizes of spatial objects and the large number of spatial objects involved. Spatial joins are usually performed by the filter-and-refinement approach. Although there exists a variety of algorithms for realizing the filter step of the join processing for large spatial data sets, not much research has been done to improve the performance of the refinement step. By clustering the output of the filter step, we are able to reduce the total number of times that spatial objects are repeatedly loaded during the refinement step, thus to reduce the I/O cost of the refinement step. In this paper, a multilevel data partitioning approach is proposed to partition objects into clusters for spatial join processing. Whenever the number of objects is greater than a threshold, say a hundred, the objects will be clustered through a multilevel scheme, i.e., first coarsening, then partitioning, and finally uncoarsening back to the original object sets, which can be further partitioned using the known partitioning methods. Experiments have been conducted and the results have shown that our method can save 20-35% of I/O cost compared with the cases where no clustering or a little clustering is done.
AB - The I/O cost of spatial join processing could be very high due to the large sizes of spatial objects and the large number of spatial objects involved. Spatial joins are usually performed by the filter-and-refinement approach. Although there exists a variety of algorithms for realizing the filter step of the join processing for large spatial data sets, not much research has been done to improve the performance of the refinement step. By clustering the output of the filter step, we are able to reduce the total number of times that spatial objects are repeatedly loaded during the refinement step, thus to reduce the I/O cost of the refinement step. In this paper, a multilevel data partitioning approach is proposed to partition objects into clusters for spatial join processing. Whenever the number of objects is greater than a threshold, say a hundred, the objects will be clustered through a multilevel scheme, i.e., first coarsening, then partitioning, and finally uncoarsening back to the original object sets, which can be further partitioned using the known partitioning methods. Experiments have been conducted and the results have shown that our method can save 20-35% of I/O cost compared with the cases where no clustering or a little clustering is done.
UR - http://www.scopus.com/inward/record.url?scp=85118733113&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85118733113&origin=recordpage
U2 - 10.1109/DANTE.1999.844963
DO - 10.1109/DANTE.1999.844963
M3 - RGC 32 - Refereed conference paper (with host publication)
AN - SCOPUS:85118733113
SN - 0-7695-0496-5
T3 - Proceedings - 1999 International Symposium on Database Applications in Non-Traditional Environments, DANTE 1999
SP - 218
EP - 225
BT - Proceedings - 1999 International Symposium on Database Applications in Non-Traditional Environments (DANTE '99)
A2 - Kambayasbi , Yahiko
A2 - Takakura , Hiroki
PB - IEEE
T2 - 1999 International Symposium on Database Applications in Non-Traditional Environments, DANTE 1999
Y2 - 28 November 1999 through 30 November 1999
ER -