Clustering social audiences in business information networks

Research output: Journal Publications and Reviews (RGC: 21, 22, 62), 21_Publication in refereed journal, peer-review

3 Scopus Citations

Author(s)

  • Yu Zheng
  • Ruiqi Hu
  • Sai-fu Fung
  • Celina Yu
  • Guodong Long
  • Ting Guo
  • Shirui Pan

Detail(s)

Original language: English
Article number: 107126
Journal / Publication: Pattern Recognition
Volume: 100
Online published: 28 Nov 2019
Publication status: Published - Apr 2020

Abstract

Business information networks involve diverse users and rich content and have emerged as important platforms for enabling business intelligence and business decision making. A key step in an organization's business intelligence process is to cluster users with similar interests into social audiences and discover the roles they play within a business network. In this article, we propose a novel machine-learning approach, called CBIN, that co-clusters business information networks to discover and understand these audiences. The CBIN framework is based on co-factorization. The audience clusters are discovered from a combination of network structures and rich contextual information, such as node interactions and node-content correlations. Because what defines an audience cluster is data-driven and clusters often overlap, pre-determining the number of clusters is usually very difficult. Therefore, we have based CBIN on an overlapping clustering paradigm with a hold-out strategy to discover the optimal number of clusters given the underlying data. Experiments on 13 real-world enterprise datasets validate the outstanding performance of CBIN compared with other state-of-the-art algorithms.
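
The abstract names three ingredients: co-factorization of structure and content, overlapping cluster membership, and a hold-out strategy for choosing the number of clusters. The sketch below illustrates those ideas in general terms only; it is not the authors' CBIN implementation. It assumes a joint nonnegative factorization A ≈ U Vᵀ and X ≈ U Wᵀ with a shared factor U, fitted by standard multiplicative updates, and it assumes that held-out entries of the adjacency matrix are scored by reconstruction error. All function names, the weight `lam`, the membership `threshold`, and the toy data are hypothetical.

```python
import numpy as np

def cofactorize(A, X, k, mask=None, lam=1.0, iters=200, seed=0, eps=1e-9):
    """Fit A ~ U V^T and X ~ U W^T with a shared nonnegative U (assumed model)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    M = np.ones_like(A) if mask is None else mask  # 1 = observed entry of A
    U, V, W = rng.random((n, k)), rng.random((n, k)), rng.random((d, k))
    for _ in range(iters):
        # Multiplicative updates keep every factor nonnegative.
        R = M * (U @ V.T)
        U *= ((M * A) @ V + lam * X @ W) / (R @ V + lam * U @ (W.T @ W) + eps)
        R = M * (U @ V.T)
        V *= ((M * A).T @ U) / (R.T @ U + eps)
        W *= (X.T @ U) / (W @ (U.T @ U) + eps)
    return U, V, W

def objective(A, X, U, V, W, mask, lam=1.0):
    """Masked squared error on structure plus weighted squared error on content."""
    return np.sum(mask * (A - U @ V.T) ** 2) + lam * np.sum((X - U @ W.T) ** 2)

def choose_k(A, X, candidates, holdout=0.1, seed=0):
    """Hide a random fraction of A, fit on the rest, score the hidden entries."""
    rng = np.random.default_rng(seed)
    train = (rng.random(A.shape) >= holdout).astype(float)
    test = 1.0 - train
    errors = {}
    for k in candidates:
        U, V, _ = cofactorize(A, X, k, mask=train, seed=seed)
        errors[k] = np.sum(test * (A - U @ V.T) ** 2) / test.sum()
    return min(errors, key=errors.get), errors

def overlapping_memberships(U, threshold=0.3):
    """A node joins every cluster whose normalized weight reaches the threshold."""
    P = U / (U.sum(axis=1, keepdims=True) + 1e-12)
    return P >= threshold

# Toy network (made up): 30 nodes, 2 audiences, nodes 12..17 belong to both.
rng = np.random.default_rng(1)
Z = np.zeros((30, 2))
Z[:18, 0] = 1
Z[12:, 1] = 1
A = (Z @ Z.T > 0).astype(float)   # link if two nodes share an audience
X = Z @ rng.random((2, 8))        # node-content features driven by audiences

best_k, errors = choose_k(A, X, candidates=[1, 2, 3, 4])
U, V, W = cofactorize(A, X, best_k)
members = overlapping_memberships(U)
print("selected k:", best_k)
print("nodes in more than one cluster:", int((members.sum(axis=1) > 1).sum()))
```

Sharing U between the two factorizations is what ties link structure and content together in this sketch, and thresholding rows of U rather than taking an argmax is what allows a node to sit in several clusters at once. The selected k depends on the random hold-out split, so in practice one would likely average the held-out error over several splits.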

Research Area(s)

  • Business information networks, Clustering, Machine learning, Social networks

Citation Format(s)

Clustering social audiences in business information networks. / Zheng, Yu; Hu, Ruiqi; Fung, Sai-fu; Yu, Celina; Long, Guodong; Guo, Ting; Pan, Shirui.

In: Pattern Recognition, Vol. 100, 107126, 04.2020.
