Abstract
Outlier ranking methods provide a quantitative measure of the outlierness of data instances in data clustering and have attracted great interest in the pattern recognition and data mining communities. However, it has been pointed out that the diverse scaling ranges of these scores make the results difficult to interpret. Moreover, popular outlier ranking scores based on simple distance measures might not accurately reflect the complex affinity among data points. In this paper, we propose a new outlier ranking method based on the consensus affinity of a cluster ensemble. Two new outlier ranking scores generalized from well-known clustering evaluation measures, Rvv from the Rand index and ARIvv from the Adjusted Rand Index (ARI), are adopted for outlierness evaluation. Compared to other outlier ranking measures, the two new measures have the desired bounds without additional transformations. Consistent with the improvement of ARI over the Rand index, we find that ARIvv also significantly outperforms Rvv. Benefiting from the consensus affinity of a cluster ensemble, our proposed method with the ARIvv score provides significant improvement over a number of competing algorithms on public UCI benchmark data sets. Both theoretical analysis and experimental validation show the effectiveness of the proposed methods.
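For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of the standard Rand index, the standard Adjusted Rand Index, and a co-association matrix over a cluster ensemble. It does not implement the Rvv or ARIvv scores, whose definitions are not given here. Treating "consensus affinity" as the fraction of partitions in which two points share a cluster is an assumption made for illustration, and the function names `rand_index`, `adjusted_rand_index`, and `co_association` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(a, b):
    """Standard Rand index: fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def adjusted_rand_index(a, b):
    """Standard ARI: pair-counting agreement corrected for chance."""
    n = len(a)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. both labelings put every point in one cluster).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def co_association(ensemble):
    """Assumed consensus affinity: share of partitions in which i and j co-cluster."""
    n = len(ensemble[0])
    m = len(ensemble)
    return [
        [sum(p[i] == p[j] for p in ensemble) / m for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    a = [0, 0, 1, 1]
    b = [0, 0, 1, 2]
    print(rand_index(a, b))           # unadjusted agreement over all pairs
    print(adjusted_rand_index(a, b))  # chance-corrected agreement
    print(co_association([[0, 0, 1], [0, 1, 1]]))
```

Both indices are bounded above by 1, and the Rand index is bounded below by 0, which is the kind of fixed range the abstract contrasts with distance-based scores whose scales vary from data set to data set.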
Original language | English |
---|---|
Title of host publication | Proceedings of the International Joint Conference on Neural Networks |
Publisher | IEEE |
Pages | 1020-1027 |
ISBN (Print) | 9781479914845 |
Publication status | Published - 3 Sept 2014 |
Event | 2014 International Joint Conference on Neural Networks, IJCNN 2014 - Beijing, China; Duration: 6 Jul 2014 → 11 Jul 2014 |
Conference
Conference | 2014 International Joint Conference on Neural Networks, IJCNN 2014 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 6/07/14 → 11/07/14 |