On the selection of weight decay parameter for faulty networks

Chi Sing Leung, Hong-Jiang Wang, John Sum

Research output: Journal Publications and Reviews (RGC 22 - Publication in policy or professional journal)

33 Citations (Scopus)

Abstract

The weight-decay technique is an effective approach to handling overfitting and weight faults. For fault-free networks, without an appropriate value of the decay parameter, the trained network is either overfitted or underfitted. However, many existing results on the selection of the decay parameter focus on fault-free networks only. It is well known that the weight-decay method can also suppress the effect of weight faults. For the faulty case, using a test set to select the decay parameter is not practical because there is a huge number of possible faulty networks for a trained network. This paper develops two mean prediction error (MPE) formulae for predicting the performance of faulty radial basis function (RBF) networks. Two fault models, multiplicative weight noise and open weight fault, are considered. Our MPE formulae involve the training error and trained weights only. Moreover, our method does not need to generate a huge number of faulty networks to measure the test error for the fault situation. The MPE formulae allow us to select appropriate values of the decay parameter for faulty networks. Our experiments showed that, although there are small differences between the true test errors (from the test set) and the MPE values, the MPE formulae can accurately locate the appropriate value of the decay parameter for minimizing the true test error of faulty networks. © 2006 IEEE.
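
The two fault models and the brute-force selection that the abstract describes as impractical can be sketched as follows. This is only an illustrative sketch, not the paper's MPE formulae, which are not given on this page. Everything in it is an assumption made for illustration: the Gaussian RBF design matrix, the ridge-style training objective, the noise level `sigma`, the fault rate `p`, the toy data, and all function names. Multiplicative weight noise is assumed to perturb each weight as w(1 + b) with zero-mean Gaussian b, and an open weight fault is assumed to set each weight to zero independently with probability p.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Gaussian RBF design matrix with one column per centre (assumed form)."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / width)

def train_weight_decay(Phi, y, decay):
    """Minimise mean squared training error + decay * ||w||^2 (assumed objective)."""
    n, m = Phi.shape
    return np.linalg.solve(Phi.T @ Phi / n + decay * np.eye(m), Phi.T @ y / n)

def multiplicative_noise(w, sigma, rng):
    """Assumed model: each weight w_j becomes w_j * (1 + b_j), b_j ~ N(0, sigma^2)."""
    return w * (1.0 + sigma * rng.standard_normal(w.shape))

def open_fault(w, p, rng):
    """Assumed model: each weight is independently set to zero with probability p."""
    return w * (rng.random(w.shape) >= p)

def faulty_test_error(w, Phi_test, y_test, inject, n_trials, rng):
    """Average test MSE over n_trials randomly generated faulty copies of w."""
    errs = [np.mean((y_test - Phi_test @ inject(w, rng)) ** 2)
            for _ in range(n_trials)]
    return float(np.mean(errs))

def sweep_decay(decays, inject, n_trials=200, seed=0):
    """Brute-force baseline: estimate the faulty test error for each decay value."""
    rng = np.random.default_rng(seed)
    # Toy 1-D regression data (hypothetical, for illustration only).
    x_train = rng.uniform(-3, 3, 200)
    y_train = np.sinc(x_train) + 0.1 * rng.standard_normal(200)
    x_test = rng.uniform(-3, 3, 500)
    y_test = np.sinc(x_test) + 0.1 * rng.standard_normal(500)
    centers = np.linspace(-3, 3, 20)
    Phi_train = rbf_design(x_train, centers, 0.5)
    Phi_test = rbf_design(x_test, centers, 0.5)
    results = {}
    for decay in decays:
        w = train_weight_decay(Phi_train, y_train, decay)
        results[decay] = faulty_test_error(w, Phi_test, y_test, inject, n_trials, rng)
    return results

decays = [1e-6, 1e-4, 1e-2, 1.0]
open_results = sweep_decay(decays, lambda w, rng: open_fault(w, 0.05, rng))
noise_results = sweep_decay(decays, lambda w, rng: multiplicative_noise(w, 0.1, rng))
for d in decays:
    print(f"decay={d:g}  open-fault MSE={open_results[d]:.4f}  "
          f"weight-noise MSE={noise_results[d]:.4f}")
```

Each decay value here costs `n_trials` simulated faulty networks plus a held-out test set. That cost is what an estimate built only from the training error and the trained weights, as the abstract describes, would avoid.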
Original language: English
Article number: 5497173
Pages (from-to): 1232-1244
Journal: IEEE Transactions on Neural Networks
Volume: 21
Issue number: 8
Publication status: Published - Aug 2010

Research Keywords

  • Faulty network
  • generalization error
  • mean prediction error
  • regularization
  • weight decay
