TY - JOUR
T1 - Hybrid clustering solution selection strategy
AU - Yu, Zhiwen
AU - Li, Le
AU - Gao, Yunjun
AU - You, Jane
AU - Liu, Jiming
AU - Wong, Hau-San
AU - Han, Guoqiang
PY - 2014/10
Y1 - 2014/10
AB - Cluster ensemble approaches make use of a set of clustering solutions which are derived from different data sources to gain a more comprehensive and significant clustering result over conventional single clustering approaches. Unfortunately, not all the clustering solutions in the ensemble contribute to the final result. In this paper, we focus on the clustering solution selection strategy in the cluster ensemble, and propose to view clustering solutions as features such that suitable feature selection techniques can be used to perform clustering solution selection. Furthermore, a hybrid clustering solution selection strategy (HCSS) is designed based on a proposed weighting function, which combines several feature selection techniques for the refinement of clustering solutions in the ensemble. Finally, a new measure is designed to evaluate the effectiveness of clustering solution selection strategies. The experimental results on both UCI machine learning datasets and cancer gene expression profiles demonstrate that HCSS works well on most of the datasets, obtains more desirable final results, and outperforms most of the state-of-the-art clustering solution selection strategies. © 2014 Elsevier Ltd.
KW - Cluster ensemble
KW - Clustering solution selection
KW - Feature selection
KW - Hybrid strategy
UR - http://www.scopus.com/inward/record.url?scp=84902369031&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-84902369031&origin=recordpage
U2 - 10.1016/j.patcog.2014.04.005
DO - 10.1016/j.patcog.2014.04.005
M3 - RGC 21 - Publication in refereed journal
SN - 0031-3203
VL - 47
SP - 3362
EP - 3375
JO - Pattern Recognition
JF - Pattern Recognition
IS - 10
ER -