
Generalized Pair-Counting Similarity Measures for Clustering and Cluster Ensembles

  • Shaohong Zhang*
  • Zongbao Yang
  • Xiaofei Xing
  • Ying Gao
  • Dongqing Xie
  • Hau-San Wong

*Corresponding author for this work

Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review


Abstract

In this paper, a number of pair-counting similarity measures associated with a general formulation of cluster ensemble are proposed. These measures are formulated based on our motivation to evaluate the consistency between an individual clustering solution and a cluster ensemble solution, or that between different cluster ensemble solutions, in a uniform manner. A number of criteria are proposed for the comparison of these generalized measures, from both the perspectives of theoretical analysis and experimental validation. We identify their different behaviors and their correlations in different scenarios of traditional clustering solutions and cluster ensembles, with the hope that the results of these studies could 1) serve as important criteria for the design and selection of evaluation measures for clustering solutions, and 2) provide explanations for ambiguous clustering results in related scenarios. Experiments with both synthetic and real data sets are conducted to verify our findings.
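For readers unfamiliar with the term, the classical pair-counting idea that the abstract generalizes can be sketched as follows. This is a minimal illustration for two ordinary hard clusterings only, assuming the standard Rand and Jaccard definitions; it is not the generalized cluster-ensemble formulation proposed in the paper, and the function names `pair_counts`, `rand_index`, and `jaccard_index` are hypothetical.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Count object pairs by agreement between two hard clusterings.

    Returns (n11, n10, n01, n00):
      n11: pair grouped together in both clusterings
      n10: together in A, separated in B
      n01: separated in A, together in B
      n00: separated in both clusterings
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    n11 = n10 = n01 = n00 = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            n11 += 1
        elif same_a:
            n10 += 1
        elif same_b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which the two clusterings agree."""
    n11, n10, n01, n00 = pair_counts(labels_a, labels_b)
    total = n11 + n10 + n01 + n00
    # Assumption: with fewer than two objects there are no pairs; treat as full agreement.
    return (n11 + n00) / total if total else 1.0


def jaccard_index(labels_a, labels_b):
    """Like the Rand index, but ignores pairs separated in both clusterings."""
    n11, n10, n01, _ = pair_counts(labels_a, labels_b)
    denom = n11 + n10 + n01
    # Assumption: if no pair is co-clustered in either solution, treat as full agreement.
    return n11 / denom if denom else 1.0


if __name__ == "__main__":
    a = [0, 0, 1, 1]
    b = [0, 0, 1, 2]
    print(pair_counts(a, b))
    print(rand_index(a, b), jaccard_index(a, b))
```

The two indices differ only in whether pairs separated in both clusterings count as agreement, which is one reason different pair-counting measures can rank the same clustering solutions differently.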
Original language: English
Article number: 8012357
Pages (from-to): 16904-16918
Journal: IEEE Access
Volume: 5
Online published: 17 Aug 2017
Publication status: Published - 2017

Research Keywords

  • cluster ensembles
  • clustering evaluation
  • similarity measures

Publisher's Copyright Statement

  • © 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission.
