Acceleration Strategies For The Backpropagation Neural Network Learning Algorithm
Date
2001-06
Authors
Zainuddin, Zarita
Abstract
The backpropagation algorithm has proven to be one of the most successful neural
network learning algorithms. However, like many gradient-based optimization
methods, it converges slowly and scales up poorly as tasks become larger and more
complex.
In this thesis, factors that govern the learning speed of the backpropagation algorithm
are investigated and mathematically analyzed in order to develop strategies to improve
the performance of this neural network learning algorithm. These factors include the
choice of initial weights, the choice of activation function and target values, and the two
backpropagation parameters, the learning rate and the momentum factor.
For the choice of initial weights, a weight initialization procedure is developed to
determine a feasible initial point from which a search direction can be computed.
Theoretical analysis of the momentum factor leads to the development of a new method,
the Dynamic Momentum Factor, which dynamically adjusts the momentum factor to
adapt itself locally to the cost function landscape. Similarly, theoretical analysis of the
learning rate parameter provides important insights into the development of two
learning rate related methods, namely, Dynamic Learning Rate Methods 1 and 2.
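The momentum-based weight updates discussed above can be illustrated with a short sketch. The adaptation rule below is only a hypothetical example of adjusting the momentum factor to the local cost-function landscape (the abstract does not give the actual Dynamic Momentum Factor rule): momentum is allowed to grow while successive updates agree in direction, and is damped when the new gradient opposes the accumulated velocity.

```python
import numpy as np

def train_with_dynamic_momentum(grad, w0, lr=0.1, mu0=0.5, steps=300):
    """Gradient descent with a momentum factor adjusted on the fly.

    NOTE: the adaptation rule here is a hypothetical illustration, not the
    thesis's Dynamic Momentum Factor method. Momentum is damped when the new
    gradient opposes the accumulated velocity (a sign of overshooting) and
    grown slightly when the two agree.
    """
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)          # velocity (accumulated update)
    mu = mu0                      # momentum factor
    for _ in range(steps):
        g = grad(w)
        if np.dot(v, g) > 0:      # velocity points uphill: damp momentum
            mu *= 0.5
        else:                     # consistent descent: let momentum grow
            mu = min(0.99, mu * 1.05)
        v = mu * v - lr * g       # standard momentum update
        w = w + v
    return w

# Example: minimize the quadratic cost f(w) = 0.5 * ||w||^2,
# whose gradient is w itself; the iterate is driven toward zero.
w_final = train_with_dynamic_momentum(lambda w: w, [3.0, -2.0])
```

The same update applies unchanged when `grad` is the backpropagated gradient of a network's error function; only the gradient callback differs.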
Extensive computer simulations and performance comparisons with the conventional
backpropagation algorithm and two other gradient-based methods, the Conjugate
Gradient and Steepest Descent methods, have demonstrated the fast convergence of the
proposed methods. The proposed methods have been implemented and tested on
several benchmark problems, and the results indicate that they are able to
provide enhanced convergence and accelerated training. Explicit computations to
determine the optimal momentum factor and learning rate values are not needed, and no
heavy computational or storage burden is incurred.
The human face recognition problem is chosen to demonstrate the effectiveness of the
proposed methods on a real world application problem. In addition, the capabilities of
the trained networks on generalization and rejection are investigated. The effects of
varying the number of hidden nodes and of noise in the input images on the network's
performance are also examined. Numerical evidence shows that the
proposed weight initialization and acceleration methods are robust with good average
performance in terms of convergence, generalization and rejection capabilities and
recognition of noisy images.
Keywords
Mathematics, Backpropagation Algorithm