Finding Theme Communities from Database Networks

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

1 Scopus Citations

Author(s)

  • Lingyang Chu
  • Zhefeng Wang
  • Jian Pei
  • Yanyan Zhang
  • Enhong Chen

Detail(s)

Original language: English
Pages (from-to): 1071-1084
Journal / Publication: Proceedings of the VLDB Endowment
Volume: 12
Issue number: 10
Publication status: Published - Jun 2019
Externally published: Yes

Abstract

Given a database network where each vertex is associated with a transaction database, we are interested in finding theme communities. Here, a theme community is a cohesive subgraph such that a common pattern is frequent in all transaction databases associated with the vertices in the subgraph. Finding all theme communities in a database network has many novel applications. However, it is challenging because even counting the number of theme communities in a database network is #P-hard. Inspired by the observation that a theme community shrinks when the length of the pattern increases, we investigate several properties of theme communities and develop TCFI, a scalable algorithm that uses these properties to effectively prune the patterns that cannot form any theme community. We also design TC-Tree, a scalable algorithm that decomposes and indexes theme communities efficiently. Retrieving a ranked list of theme communities from a TC-Tree of hundreds of millions of theme communities takes less than 1 second. Extensive experiments and a case study demonstrate the effectiveness and scalability of TCFI and TC-Tree in discovering and querying meaningful theme communities from large database networks.
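
The definition above can be illustrated with a minimal Python sketch. It is not TCFI or TC-Tree, and it rests on assumptions the abstract does not state: frequency is taken as the fraction of a vertex's transactions that contain the pattern, `min_freq` is a hypothetical threshold, and plain connectivity stands in for whatever cohesiveness measure the paper actually uses. The function names and the toy data are likewise made up for illustration.

```python
from collections import deque


def frequency(pattern, transactions):
    """Fraction of transactions that contain every item of `pattern`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


def candidate_communities(edges, databases, pattern, min_freq):
    """Connected vertex groups in which `pattern` is frequent at every vertex.

    Assumption: connectivity is a simplified stand-in for cohesiveness.
    """
    pattern = frozenset(pattern)
    # Keep only vertices whose own database makes the pattern frequent.
    keep = {v for v, db in databases.items()
            if frequency(pattern, db) >= min_freq}
    # Build the subgraph induced by the kept vertices.
    adj = {v: set() for v in keep}
    for u, v in edges:
        if u in keep and v in keep:
            adj[u].add(v)
            adj[v].add(u)
    # Collect connected components with breadth-first search.
    seen, groups = set(), []
    for start in sorted(keep):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), set()
        while queue:
            u = queue.popleft()
            group.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if len(group) > 1:  # ignore isolated vertices
            groups.append(group)
    return groups


# Toy database network: four vertices, each with three transactions.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
databases = {
    "a": [{"x", "y"}, {"x", "y"}, {"x"}],
    "b": [{"x", "y"}, {"x"}, {"x", "z"}],
    "c": [{"x", "y"}, {"x", "y"}, {"y"}],
    "d": [{"z"}, {"x"}, {"z"}],
}

short = candidate_communities(edges, databases, {"x"}, 0.6)
long_ = candidate_communities(edges, databases, {"x", "y"}, 0.6)
```

On this toy input the pattern {x} yields the group {a, b, c}, while the longer pattern {x, y} yields only {a, c}. The second group is contained in the first, which matches the observation that a theme community shrinks as the pattern grows.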

Citation Format(s)

Finding Theme Communities from Database Networks. / Chu, Lingyang; Wang, Zhefeng; Pei, Jian et al.

In: Proceedings of the VLDB Endowment, Vol. 12, No. 10, 06.2019, p. 1071-1084.
