TY - JOUR
T1 - Generalized RLS approach to the training of neural networks
AU - Xu, Yong
AU - Wong, Kwok-Wo
AU - Leung, Chi-Sing
PY - 2006/1
Y1 - 2006/1
AB - Recursive least squares (RLS) is an efficient approach to neural network training. However, in the classical RLS algorithm the energy function contains no explicit decay term, which leads to unsatisfactory generalization in the trained networks. In this paper, we propose a generalized RLS (GRLS) model that includes a general decay term in the energy function for training feedforward neural networks. In particular, four weight decay functions are discussed: the quadratic weight decay, the constant weight decay, and the newly proposed multimodal and quartic weight decays. With the GRLS approach, not only is the generalization ability of the trained networks significantly improved, but more unnecessary weights are also pruned, yielding a compact network. Furthermore, the computational complexity of GRLS remains the same as that of the standard RLS algorithm. The advantages and tradeoffs of the different decay functions are analyzed and demonstrated with examples. Simulation results show that our approach meets the design goals: improving the generalization ability of the trained network while obtaining a compact network. © 2006 IEEE.
KW - Extended Kalman filtering (EKF)
KW - Neural network
KW - Recursive least square (RLS) algorithm
KW - Weight decay
UR - http://www.scopus.com/inward/record.url?scp=33144485895&partnerID=8YFLogxK
U2 - 10.1109/TNN.2005.860857
DO - 10.1109/TNN.2005.860857
M3 - RGC 22 - Publication in policy or professional journal
C2 - 16526473
SN - 1045-9227
VL - 17
SP - 19
EP - 34
JO - IEEE Transactions on Neural Networks
JF - IEEE Transactions on Neural Networks
IS - 1
ER -
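N1 - Illustration (not from the record): the abstract describes adding a weight decay term to RLS-based network training. The sketch below is a minimal, assumed example of standard exponentially weighted RLS for a single linear output with an ad-hoc quadratic-decay (shrinkage) step appended to each update; it does not reproduce the paper's GRLS derivation, in which the decay term is folded into the energy function. The function name rls_with_quadratic_decay and all parameter values are hypothetical; numpy is assumed available.

```python
# Minimal sketch (assumption): exponentially weighted RLS for one linear output,
# with a quadratic weight-decay (shrinkage) step tacked onto each update.
# This only illustrates the idea of combining RLS with weight decay; it is not
# the paper's GRLS algorithm.
import numpy as np

def rls_with_quadratic_decay(X, d, lam=0.99, p0=100.0, decay=1e-3):
    """Fit linear weights w to samples X (n_samples x n_features) and targets d."""
    n_features = X.shape[1]
    w = np.zeros(n_features)
    P = p0 * np.eye(n_features)             # large initial inverse-correlation estimate
    for x, target in zip(X, d):
        Px = P @ x
        k = Px / (lam + x @ Px)             # gain vector
        e = target - w @ x                  # a priori error
        w = w + k * e                       # standard RLS weight update
        w = (1.0 - decay) * w               # quadratic decay: shrink weights (assumption)
        P = (P - np.outer(k, Px)) / lam     # update inverse-correlation matrix
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))
    true_w = np.array([1.0, -2.0, 0.0, 0.5, 0.0])
    d = X @ true_w + 0.01 * rng.normal(size=500)
    print(rls_with_quadratic_decay(X, d))   # near-zero entries suggest prunable weights
```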