TY - GEN
T1 - On weight-noise-injection training
AU - Ho, Kevin
AU - Leung, Chi-Sing
AU - Sum, John
PY - 2009
Y1 - 2009
N2 - While injecting weight noise during training has been proposed for more than a decade as a way to improve the convergence, generalization, and fault tolerance of a neural network, little theoretical work has addressed its convergence proof or the objective function it minimizes. By applying the Gladyshev Theorem, it is shown that injecting weight noise during the training of an RBF network converges almost surely. Moreover, the corresponding objective function is essentially the mean squared error (MSE). This objective function indicates that injecting weight noise during the training of a radial basis function (RBF) network is not able to improve fault tolerance. Although this technique has been effectively applied to the multilayer perceptron (MLP), a further analysis of the expected update equation for training an MLP with weight noise injection is presented. The performance difference between these two models under weight noise injection is discussed. © 2009 Springer Berlin Heidelberg.
UR - http://www.scopus.com/inward/record.url?scp=70349152622&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-70349152622&origin=recordpage
U2 - 10.1007/978-3-642-03040-6_112
DO - 10.1007/978-3-642-03040-6_112
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 3642030394
SN - 9783642030390
VL - 5507 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 919
EP - 926
BT - Advances in Neuro-Information Processing
PB - Springer Verlag
T2 - 15th International Conference on Neuro-Information Processing, ICONIP 2008
Y2 - 25 November 2008 through 28 November 2008
ER -