Acceleration Strategies For The Backpropagation Neural Network Learning Algorithm

Date
2001-06
Authors
Zainuddin, Zarita
Abstract
The backpropagation algorithm has proven to be one of the most successful neural network learning algorithms. However, as with many gradient-based optimization methods, it converges slowly and scales up poorly as tasks become larger and more complex. In this thesis, factors that govern the learning speed of the backpropagation algorithm are investigated and mathematically analyzed in order to develop strategies to improve the performance of this neural network learning algorithm. These factors include the choice of initial weights, the choice of activation function and target values, and the two backpropagation parameters, the learning rate and the momentum factor. For the choice of initial weights, a weight initialization procedure is developed to determine a feasible initial point from which a search direction can be computed. Theoretical analysis of the momentum factor leads to the development of a new method, the Dynamic Momentum Factor, which dynamically adjusts the momentum factor to adapt itself locally to the cost function landscape. Similarly, theoretical analysis of the learning rate parameter provides important insights into the development of two learning rate related methods, namely, Dynamic Learning Rate Methods 1 and 2. Extensive computer simulations and performance comparisons with the conventional backpropagation algorithm and two other gradient-based methods, the Conjugate Gradient and Steepest Descent methods, have demonstrated the fast convergence of the proposed methods. The proposed methods have been implemented and tested on several benchmark problems, and the results indicate that these methods are able to provide enhanced convergence and accelerated training. Explicit computations to determine the optimal momentum factor and learning rate values are not needed, and no heavy computational or storage burden is incurred.
The human face recognition problem is chosen to demonstrate the effectiveness of the proposed methods on a real-world application problem. In addition, the capabilities of the trained networks on generalization and rejection are investigated. The effect of varying the number of hidden nodes on the network's performance and the performance of the network on noisy images are also examined. Numerical evidence shows that the proposed weight initialization and acceleration methods are robust, with good average performance in terms of convergence, generalization and rejection capabilities, and recognition of noisy images.
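For background on the parameters the abstract analyzes, the standard backpropagation weight update with a momentum term is Δw(t) = −η∇E(t) + αΔw(t−1), where η is the learning rate and α the momentum factor. The sketch below illustrates this fixed-parameter baseline on a toy XOR task; it is not the thesis's Dynamic Momentum Factor or Dynamic Learning Rate methods (which adapt η and α during training, by procedures not given in this abstract), and the small uniform weight initialization is a generic placeholder, not the initialization procedure the thesis develops.

```python
import numpy as np

def train_with_momentum(X, y, lr=0.3, momentum=0.8, epochs=3000, seed=0):
    """Backpropagation on a 2-4-1 sigmoid network with a fixed
    momentum term.  Illustrative baseline only: the thesis's methods
    adjust lr and momentum dynamically during training."""
    rng = np.random.default_rng(seed)
    # Placeholder initialization: small uniform random weights.
    W1 = rng.uniform(-0.5, 0.5, (2, 4)); b1 = np.zeros(4)
    W2 = rng.uniform(-0.5, 0.5, (4, 1)); b2 = np.zeros(1)
    # Previous weight changes, for the momentum term.
    vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
    vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    losses = []
    for _ in range(epochs):
        # Forward pass.
        h = sig(X @ W1 + b1)
        out = sig(h @ W2 + b2)
        losses.append(float(np.mean((out - y) ** 2)))
        # Backward pass for the squared-error cost.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        # Momentum update: dw(t) = -lr * grad + momentum * dw(t-1).
        vW2 = -lr * (h.T @ d_out) + momentum * vW2; W2 += vW2
        vb2 = -lr * d_out.sum(0) + momentum * vb2; b2 += vb2
        vW1 = -lr * (X.T @ d_h) + momentum * vW1; W1 += vW1
        vb1 = -lr * d_h.sum(0) + momentum * vb1; b1 += vb1
    return out, losses

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
pred, losses = train_with_momentum(X, y)
```

With the momentum term, each update reuses a fraction of the previous step, which smooths the search trajectory across the cost landscape; the thesis's contribution is choosing that fraction (and the learning rate) adaptively rather than by hand.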
Keywords
Mathematics, Backpropagation Algorithm