VSS SPU-EBP: Variable step size sequential partial update error back propagation algorithm
In MLP networks with hundreds of thousands of weights that must be trained on millions of samples, the time and space complexity can become very large, and training the network with the error back propagation (EBP) algorithm may become impractical.
Sequential partial updating (SPU) is an effective way to reduce the computational load and power consumption of an implementation. The approach is particularly useful for MLP networks with a large number of weights in each layer, where updating every weight in each round of the EBP algorithm is costly. Although it reduces the computational cost and elapsed CPU time of each round, it can increase the number of epochs required for convergence and therefore the total convergence time. To further speed up the convergence of the SPU-EBP algorithm, we propose a Variable Step Size (VSS) approach. The VSS SPU-EBP algorithm uses a gradient-based learning rate within the SPU-EBP algorithm to accelerate training. We also derive an upper bound for the step size of the SPU-EBP algorithm.
Keywords: Neural Network, Error Back Propagation, MLP (Multi-Layered Perceptron), Sequential Partial Update Algorithm.
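The idea summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the round-robin partition in `partial_mask`, the step-size rule in `variable_step`, and the cap `mu_max` are all assumptions chosen for illustration, since the abstract gives neither the exact gradient-based rule nor the derived upper bound. For readability the sketch computes the full gradient and then masks it; a real partial-update implementation would compute only the selected entries to save work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    """One-hidden-layer MLP with sigmoid units."""
    W1, b1, W2, b2 = params
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    return H, Y

def loss(params, X, T):
    """Half mean squared error over the batch."""
    _, Y = forward(params, X)
    return 0.5 * np.sum((Y - T) ** 2) / X.shape[0]

def gradients(params, X, T):
    """Standard error back propagation for the loss above."""
    W1, b1, W2, b2 = params
    H, Y = forward(params, X)
    dY = (Y - T) * Y * (1.0 - Y) / X.shape[0]
    dH = (dY @ W2.T) * H * (1.0 - H)
    return [X.T @ dH, dH.sum(axis=0), H.T @ dY, dY.sum(axis=0)]

def partial_mask(shape, k, num_subsets):
    """Assumed partition: entry i belongs to subset i mod num_subsets;
    iteration k selects subset k mod num_subsets (round-robin)."""
    idx = np.arange(int(np.prod(shape))).reshape(shape)
    return (idx % num_subsets) == (k % num_subsets)

def variable_step(grad_subset, mu_max):
    """Hypothetical gradient-based step size, always below mu_max.
    It stands in for the paper's rule, which the abstract does not state."""
    g = np.linalg.norm(grad_subset)
    return mu_max * g / (1.0 + g)

def spu_step(params, X, T, k, num_subsets, mu_max):
    """Update only the weights selected at iteration k."""
    grads = gradients(params, X, T)
    new_params = []
    for W, g in zip(params, grads):
        mask = partial_mask(W.shape, k, num_subsets)
        mu = variable_step(g[mask], mu_max)
        new_params.append(W - mu * mask * g)  # unselected entries are unchanged
    return new_params

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(32, 3))
    T = (X.sum(axis=1, keepdims=True) > 0).astype(float)
    params = [rng.normal(size=(3, 4)), np.zeros(4),
              rng.normal(size=(4, 1)), np.zeros(1)]
    print("initial loss:", loss(params, X, T))
    for k in range(200):
        params = spu_step(params, X, T, k, num_subsets=2, mu_max=2.0)
    print("final loss:", loss(params, X, T))
```

With `num_subsets=2`, each iteration touches roughly half of the weights in every layer, and a full pass over all weights takes two iterations. The value of `mu_max` here is arbitrary and would be replaced by the bound derived in the paper.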