TY - JOUR
T1 - Rough set and scatter search metaheuristic based feature selection for credit scoring
AU - Wang, Jue
AU - Hedar, Abdel-Rahman
AU - Wang, Shouyang
AU - Ma, Jian
PY - 2012/5
Y1 - 2012/5
N2 - As the credit industry has grown rapidly, credit scoring models have been widely used by the financial industry to improve cash flow and credit collections. However, credit datasets contain a large amount of redundant information and features, which leads to lower accuracy and higher complexity of the credit scoring model. Effective feature selection methods are therefore necessary for credit datasets with a huge number of features. In this paper, a novel approach to feature selection, called RSFS, based on rough set and scatter search is proposed. In RSFS, conditional entropy is used as the heuristic to search for optimal solutions. Two credit datasets from the UCI database are selected to demonstrate the competitive performance of RSFS with three credit models: a neural network model, a J48 decision tree, and logistic regression. The experimental results show that RSFS performs better in reducing computational costs and improving classification accuracy than the base classification methods. © 2011 Elsevier Ltd. All rights reserved.
AB - As the credit industry has grown rapidly, credit scoring models have been widely used by the financial industry to improve cash flow and credit collections. However, credit datasets contain a large amount of redundant information and features, which leads to lower accuracy and higher complexity of the credit scoring model. Effective feature selection methods are therefore necessary for credit datasets with a huge number of features. In this paper, a novel approach to feature selection, called RSFS, based on rough set and scatter search is proposed. In RSFS, conditional entropy is used as the heuristic to search for optimal solutions. Two credit datasets from the UCI database are selected to demonstrate the competitive performance of RSFS with three credit models: a neural network model, a J48 decision tree, and logistic regression. The experimental results show that RSFS performs better in reducing computational costs and improving classification accuracy than the base classification methods. © 2011 Elsevier Ltd. All rights reserved.
KW - Credit scoring
KW - Feature selection
KW - Meta-heuristics
KW - Rough set
KW - Scatter search
UR - http://www.scopus.com/inward/record.url?scp=84856536378&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-84856536378&origin=recordpage
U2 - 10.1016/j.eswa.2011.11.011
DO - 10.1016/j.eswa.2011.11.011
M3 - RGC 21 - Publication in refereed journal
SN - 0957-4174
VL - 39
SP - 6123
EP - 6128
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - 6
ER -