Abstract
Injecting weight noise during training was proposed almost two decades ago as a simple technique to improve the fault tolerance and generalization of a multilayer perceptron (MLP). However, the convergence behavior of these noise-injection algorithms has received little attention. In this paper we therefore present convergence proofs for two such algorithms for MLPs. The first combines multiplicative weight noise injection with weight decay (MWN-WD) during training. The second combines additive weight noise injection with weight decay (AWN-WD) during training. Let m be the number of hidden nodes of an MLP, α be the weight decay constant and Sb be the noise variance. It is shown that the MWN-WD algorithm converges with probability one if α > √Sbm, while the AWN-WD algorithm converges with probability one if α > 0. © 2010 IEEE.
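The two schemes named in the abstract can be illustrated with a minimal sketch of one online training step. The abstract does not give the update rule, so everything here is an assumption made for illustration: the tiny one-input network, the function names `perturb` and `train_step`, the learning rate `lr`, and the choice to evaluate the gradient at the noisy weights while applying the update and the decay term to the clean weights. It is not the authors' exact formulation.

```python
import math
import random


def perturb(weights, noise_var, mode, rng):
    """Return a noisy copy of `weights` (hypothetical helper).

    multiplicative: w * (1 + b),  additive: w + b,  with b ~ N(0, noise_var).
    """
    std = math.sqrt(noise_var)
    if mode == "multiplicative":
        return [w * (1.0 + rng.gauss(0.0, std)) for w in weights]
    if mode == "additive":
        return [w + rng.gauss(0.0, std) for w in weights]
    raise ValueError("mode must be 'multiplicative' or 'additive'")


def train_step(hidden_w, output_w, x, y, lr, alpha, noise_var, mode, rng):
    """One online step for a 1-input MLP with m tanh hidden nodes and a linear output.

    Assumed form: gradient of the squared error taken at the noisy weights,
    then a gradient step plus weight decay (constant `alpha`) on the clean weights.
    """
    noisy_h = perturb(hidden_w, noise_var, mode, rng)
    noisy_o = perturb(output_w, noise_var, mode, rng)

    # Forward pass through the noisy network.
    acts = [math.tanh(w * x) for w in noisy_h]
    err = y - sum(o * a for o, a in zip(noisy_o, acts))

    # Descent directions for 0.5 * err**2, evaluated at the noisy weights.
    step_o = [err * a for a in acts]
    step_h = [err * o * (1.0 - a * a) * x for o, a in zip(noisy_o, acts)]

    # Update the clean weights: gradient step minus weight decay.
    new_h = [w + lr * (g - alpha * w) for w, g in zip(hidden_w, step_h)]
    new_o = [w + lr * (g - alpha * w) for w, g in zip(output_w, step_o)]
    return new_h, new_o


if __name__ == "__main__":
    rng = random.Random(0)
    m = 3  # number of hidden nodes
    hidden = [rng.uniform(-1, 1) for _ in range(m)]
    output = [rng.uniform(-1, 1) for _ in range(m)]
    for _ in range(100):
        x = rng.uniform(-1, 1)
        hidden, output = train_step(
            hidden, output, x, math.sin(x),
            lr=0.05, alpha=0.01, noise_var=0.01,
            mode="multiplicative", rng=rng,
        )
    print(hidden, output)
```

Switching `mode` to `"additive"` gives the AWN-WD-style variant under the same assumptions. The sketch only shows where the noise and the decay term enter the update; it does not check the convergence conditions stated in the abstract.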
| Original language | English |
|---|---|
| Title of host publication | Proceedings - International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2010 |
| Pages | 358-365 |
| Publication status | Published - 2010 |
| Event | 2010 15th Conference on Technologies and Applications of Artificial Intelligence, TAAI 2010 - Hsinchu, Taiwan, China. Duration: 18 Nov 2010 → 20 Nov 2010 |
Conference
| Conference | 2010 15th Conference on Technologies and Applications of Artificial Intelligence, TAAI 2010 |
|---|---|
| Place | Taiwan, China |
| City | Hsinchu |
| Period | 18/11/10 → 20/11/10 |
Research Keywords
- Convergence
- Learning
- MLP
- Weight noise
Fingerprint
Dive into the research topics of 'Convergence analysis of multiplicative weight noise injection during training'. Together they form a unique fingerprint.