Stability analysis of feature ranking techniques in the presence of noise: a comparative study

Iman Ramezani, Mojtaba Khorram Niaki*, Milad Dehghani, Mostafa Rezapour

*Corresponding author for this work

    Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

    4 Citations (Scopus)

    Abstract

    Noisy data is a common problem in real-world datasets and may affect the performance of data models, the decisions based on them, and the performance of feature ranking techniques. In this paper, we show how stability changes when different feature ranking methods are applied to data containing attribute noise and class noise. We use Kendall's Tau rank correlation and Spearman rank correlation to evaluate the stability of various feature ranking methods, quantifying the degree of agreement between the ordered list of features produced by a filter on a clean dataset and its outputs on the same dataset corrupted with different combinations of noise levels. According to the Kendall and Spearman measures, the Gini index (GI) and information gain (IG), respectively, perform best. Both measures, however, show that ReliefF (RF) is the most sensitive to noise and therefore the worst performer.
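
To make the stability measure in the abstract concrete, the following is a minimal sketch of how agreement between a clean-data ranking and a noisy-data ranking could be computed. It is not the authors' code: the function names (`to_ranks`, `kendall_tau`, `spearman_rho`) and the two example feature orderings are hypothetical, and the formulas used are the standard tie-free definitions of Kendall's Tau and Spearman's rho, which is an assumption about the setting (a ranking of distinct features has no ties).

```python
from itertools import combinations


def to_ranks(ordered_features):
    """Map each feature name to its position (1 = ranked most relevant)."""
    return {name: pos for pos, name in enumerate(ordered_features, start=1)}


def kendall_tau(list_a, list_b):
    """Kendall's Tau between two tie-free rankings of the same features."""
    ra, rb = to_ranks(list_a), to_ranks(list_b)
    concordant = discordant = 0
    for f, g in combinations(list_a, 2):
        # A pair is concordant when both rankings order f and g the same way.
        if (ra[f] - ra[g]) * (rb[f] - rb[g]) > 0:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(list_a) * (len(list_a) - 1) // 2
    return (concordant - discordant) / n_pairs


def spearman_rho(list_a, list_b):
    """Spearman's rho via the tie-free formula 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    ra, rb = to_ranks(list_a), to_ranks(list_b)
    n = len(list_a)
    d_squared = sum((ra[f] - rb[f]) ** 2 for f in list_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings: one from a filter on clean data, one after noise injection.
clean = ["f1", "f2", "f3", "f4", "f5"]
noisy = ["f2", "f1", "f3", "f5", "f4"]

tau = kendall_tau(clean, noisy)
rho = spearman_rho(clean, noisy)
```

In this toy case two of the ten feature pairs swap order, so `tau` is about 0.6 and `rho` about 0.8; values closer to 1 indicate a ranking method whose output changes less under noise.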
    Original language: English
    Pages (from-to): 413-427
    Journal: International Journal of Business Intelligence and Data Mining
    Volume: 17
    Issue number: 4
    Online published: 28 Apr 2020
    Publication status: Published - 2020

    Research Keywords

    • Attribute noise
    • Class noise
    • Filter-based feature ranking
    • Kendall's Tau rank correlation
    • Spearman rank correlation
    • Stability
    • Threshold-based feature ranking
