TY - GEN
T1 - Communication-Optimal Distributed Dynamic Graph Clustering
AU - Zhu, Chun Jiang
AU - Zhu, Tan
AU - Lam, Kam-Yiu
AU - Han, Song
AU - Bi, Jinbo
PY - 2019/1
Y1 - 2019/1
N2 - We consider the problem of clustering graph nodes over large-scale dynamic graphs, such as citation networks, images, and web networks, when graph updates such as node/edge insertions/deletions are observed distributively. We propose communication-efficient algorithms for two well-established communication models, namely the message passing and the blackboard models. Given a graph with n nodes that is observed at s remote sites over time [1, t], the two proposed algorithms have communication costs Õ(ns) and Õ(n + s) (Õ hides a polylogarithmic factor), almost matching their lower bounds, Ω(ns) and Ω(n + s), respectively, in the message passing and the blackboard models. More importantly, we prove that at each time point in [1, t] our algorithms generate clustering quality nearly as good as that of centralizing all updates up to that time and then applying a standard centralized clustering algorithm. We conducted extensive experiments on both synthetic and real-life datasets, which confirmed the communication efficiency of our approach over baseline algorithms while achieving comparable clustering results.
AB - We consider the problem of clustering graph nodes over large-scale dynamic graphs, such as citation networks, images, and web networks, when graph updates such as node/edge insertions/deletions are observed distributively. We propose communication-efficient algorithms for two well-established communication models, namely the message passing and the blackboard models. Given a graph with n nodes that is observed at s remote sites over time [1, t], the two proposed algorithms have communication costs Õ(ns) and Õ(n + s) (Õ hides a polylogarithmic factor), almost matching their lower bounds, Ω(ns) and Ω(n + s), respectively, in the message passing and the blackboard models. More importantly, we prove that at each time point in [1, t] our algorithms generate clustering quality nearly as good as that of centralizing all updates up to that time and then applying a standard centralized clustering algorithm. We conducted extensive experiments on both synthetic and real-life datasets, which confirmed the communication efficiency of our approach over baseline algorithms while achieving comparable clustering results.
KW - SPECTRAL SPARSIFICATION
UR - http://www.scopus.com/inward/record.url?scp=85090801589&partnerID=8YFLogxK
UR - http://gateway.isiknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=LinksAMR&SrcApp=PARTNER_APP&DestLinkType=FullRecord&DestApp=WOS&KeyUT=000486572500060
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85090801589&origin=recordpage
U2 - 10.1609/aaai.v33i01.33015957
DO - 10.1609/aaai.v33i01.33015957
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 978-1-57735-809-1
T3 - Proceedings of the AAAI Conference on Artificial Intelligence
SP - 5957
EP - 5964
BT - AAAI-19 / IAAI-19 / EAAI-19 Proceedings
PB - AAAI Press
CY - Palo Alto, California
T2 - 33rd AAAI Conference on Artificial Intelligence (AAAI 2019), 31st Annual Conference on Innovative Applications of Artificial Intelligence (IAAI 2019) and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI 2019)
Y2 - 27 January 2019 through 1 February 2019
ER -