Algorithmic Foundations of Distributed Large-Scale Graph Processing

  • LI, Minming (Principal Investigator / Project Coordinator)
  • Konrad, Christian (Co-Investigator)
  • ROBINSON, Peter (Co-Investigator)

Project: Research

Project Details

Description

Achieving efficient computation on massive graph data sets is a fundamental challenge with numerous applications, ranging from social network analysis and machine learning to protein interaction graphs studied in computational biology. The scale of such massive graphs, which often exceeds the memory and bandwidth capabilities of a single machine has prompted practitioners to develop distributed graph processing frameworks such as Google Pregel, Apache Spark/GraphX and Giraph. The overarching goal of this project is to develop the theoretical foundations of algorithms for efficient distributed computation on massive graphs in computing models that capture the main features of real-world graph processing frameworks. This includes the design of new distributed algorithms that scale to massive graphs for problems such as computing shortest distances between vertices, computing maximal independent sets, and finding sparse subgraphs (i.e. with few edges) that exhibit certain properties. Given that real-world data sets are continuously evolving, we also plan to derive new algorithms that can handle dynamically-changing input graphs. The obtained algorithms will be experimentally evaluated by implementing them in open source graph processing frameworks, which will have a tangible impact on industrial practice.A crucial challenge for performing fast distributed computation on large graphs is to come up with algorithms that do not send too many messages, i.e., are communication-efficient. We will explore new algorithmic techniques to overcome this obstacle and we anticipate that this also provides insights on designing communication- and time-efficient solutions for classic distributed computing models, where each graph vertex represents a computational entity.A complementary goal of the project is to understand how much communication a distributed graph algorithm needs to perform, while still being able to quickly arrive at a correct solution. More specifically, this work will reveal the complexity lower bounds on the trade-offs between time and the required communication in distributed computing models, which is still an open question for many fundamental graph problems. To place our results in a broader context, we will investigate the complexity-theoretic foundations of problems in the large-scale graph setting by developing complexity classes and providing simulations between different computational models for massive data sets.
Project number9042983
Grant typeGRF
StatusFinished
Effective start/end date1/01/215/06/25

Fingerprint

Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.
  • Altruism in Facility Location Problems

    Zhou, H., Chan, H. & Li, M., Feb 2024, Thirty-Eighth AAAI Conference on Artificial Intelligence, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, Fourteenth Symposium on Educational Advances in Artificial Intelligence. Wooldridge, M., Dy, J. & Natarajan, S. (eds.). Washington, DC: AAAI Press, p. 9993-10001 9 p. (Proceedings of the AAAI Conference on Artificial Intelligence; vol. 38, no. 9).

    Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review

    Open Access
  • Dynamic Maximal Matching in Clique Networks

    Li, M., Robinson, P. & Zhu, X., 2024, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Guruswami, V. (ed.). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Vol. 287. p. 73:1-73:21

    Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review

    Open Access
    File
    19 Downloads (CityUHK Scholars)
  • Facility Location Games with Task Allocation

    Gong, Z., Li, M. & Zhou, H., 2024, AAMAS ’24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), p. 2285-2287 (Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS).

    Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review