Abstract
Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n ≫ k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation.
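The input model described above can be illustrated with a minimal sketch. It assumes a random vertex partition, in which each node, together with its incident edges, is assigned to a machine chosen uniformly at random; the abstract only says the graph is "randomly partitioned", so this reading is an assumption. The function name `random_vertex_partition` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def random_vertex_partition(edges, num_nodes, k, seed=0):
    """Assign each node to one of k machines uniformly at random.

    Assumption (not stated in the abstract): a machine that hosts a node
    also knows every edge incident to that node.

    Returns (home, local_edges):
      home[v]        -- index of the machine hosting node v
      local_edges[i] -- edges with at least one endpoint hosted on machine i
    """
    rng = random.Random(seed)
    home = [rng.randrange(k) for _ in range(num_nodes)]
    local_edges = defaultdict(list)
    for u, v in edges:
        # An edge is stored once per distinct machine hosting an endpoint.
        for machine in {home[u], home[v]}:
            local_edges[machine].append((u, v))
    return home, dict(local_edges)


if __name__ == "__main__":
    # A 6-node cycle split across k = 3 machines.
    cycle = [(i, (i + 1) % 6) for i in range(6)]
    home, local_edges = random_vertex_partition(cycle, num_nodes=6, k=3)
    for machine in sorted(local_edges):
        print(machine, local_edges[machine])
```

With n ≫ k, each machine hosts about n/k nodes in expectation, which is why the amount of information a machine must receive, rather than local computation, governs the round complexity.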
Our main contribution is the General Lower Bound Theorem, which can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information the machines require to solve the problem. Our approach is generic, and the theorem can be used in a "cookbook" fashion to show distributed lower bounds for several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration. As demonstrated in the case of PageRank, our approach can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) for which it is not obvious how to apply communication complexity techniques. As demonstrated in the case of triangle enumeration, it can yield stronger round lower bounds, as well as message-round tradeoffs, than approaches that use communication complexity techniques.
We then present distributed algorithms for PageRank and triangle enumeration with a round complexity that (almost) matches the respective lower bounds; these algorithms exhibit a round complexity which scales superlinearly in k, improving significantly over previous results for these problems [Klauck et al., SODA 2015]. Specifically, we show the following results:
• PageRank: We show a lower bound of Ω(n/k²) rounds, and present a distributed algorithm that computes the PageRank of all the nodes of a graph in Õ(n/k²) rounds.
• Triangle enumeration: We show that there exist graphs with m edges where any distributed algorithm requires Ω(m/k^(5/3)) rounds. This result also implies the first non-trivial lower bound of Ω(n^(1/3)) rounds for the congested clique model, which is tight up to logarithmic factors. We then present a distributed algorithm that enumerates all the triangles of a graph in Õ(m/k^(5/3) + n/k^(4/3)) rounds.
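As a quick sanity check on how the congested clique bound relates to the triangle enumeration bounds, the following worked substitution assumes (this is not spelled out in the abstract) that the congested clique corresponds to k = n machines and that the hard instances are dense, with m = Θ(n²) edges:

```latex
\[
  \Omega\!\left(\frac{m}{k^{5/3}}\right)\Bigg|_{k=n,\; m=\Theta(n^{2})}
  = \Omega\!\left(\frac{n^{2}}{n^{5/3}}\right)
  = \Omega\!\left(n^{1/3}\right),
\]
\[
  \tilde{O}\!\left(\frac{m}{k^{5/3}} + \frac{n}{k^{4/3}}\right)\Bigg|_{k=n,\; m=\Theta(n^{2})}
  = \tilde{O}\!\left(n^{1/3} + n^{-1/3}\right)
  = \tilde{O}\!\left(n^{1/3}\right).
\]
```

Under these assumptions the lower and upper bounds agree up to the logarithmic factors hidden by the Õ notation, consistent with the tightness claim above.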
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 30th ACM Symposium on Parallelism in Algorithms and Architectures |
| Publisher | Association for Computing Machinery |
| Pages | 405-414 |
| ISBN (Print) | 9781450357999 |
| DOIs | |
| Publication status | Published - Jul 2018 |
| Externally published | Yes |
| Event | 30th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2018, Vienna University of Technology, Vienna, Austria. Duration: 16 Jul 2018 → 18 Jul 2018. https://spaa.acm.org/2018/index.html |
Publication series
| Name | Annual ACM Symposium on Parallelism in Algorithms and Architectures |
|---|---|
Conference
| Conference | 30th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2018 |
|---|---|
| Place | Austria |
| City | Vienna |
| Period | 16/07/18 → 18/07/18 |
| Internet address | https://spaa.acm.org/2018/index.html |
Fingerprint
Dive into the research topics of 'On the distributed complexity of large-scale graph computations'. Together they form a unique fingerprint.