Open Access
A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning
Author(s) - Martin Möller
Publication year - 1990
Publication title - DAIMI PB
Language(s) - English
Resource type - Journals
eISSN - 2245-9316
pISSN - 0105-8517
DOI - 10.7146/dpb.v19i339.6570
Subject(s) - broyden–fletcher–goldfarb–shanno algorithm, conjugate gradient method, backpropagation, algorithm, artificial neural network, nonlinear conjugate gradient method, convergence (economics), gradient method, mathematics, computer science, gradient descent, mathematical optimization, artificial intelligence, telecommunications, asynchronous communication, economics, economic growth
A supervised learning algorithm (Scaled Conjugate Gradient, SCG) with a superlinear convergence rate is introduced. The algorithm is based on a class of optimization techniques well known in numerical analysis as the Conjugate Gradient Methods. SCG uses second-order information from the neural network but requires only O(N) memory, where N is the number of weights in the network. The performance of SCG is benchmarked against that of the standard backpropagation algorithm (BP), conjugate gradient backpropagation (CGB), and the one-step Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton algorithm (BFGS). SCG yields a speed-up of at least an order of magnitude relative to BP. The speed-up depends on the convergence criterion: the greater the required reduction in error, the greater the speed-up. SCG is fully automated, has no user-dependent parameters, and avoids the time-consuming line search that CGB and BFGS perform in each iteration to determine an appropriate step size.

Incorporating problem-dependent structural information in the architecture of a neural network often lowers the overall complexity. The smaller the complexity of the neural network relative to the problem domain, the greater the possibility that the weight space contains long ravines characterized by sharp curvature. While BP is inefficient on these ravines, it is shown that SCG handles them effectively.
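The idea of using second-order information with only O(N) memory, and of replacing the line search with a curvature-based step size, can be illustrated with a small sketch. This is a simplified illustration, not the SCG algorithm as published: the adaptive scaling of the curvature term is omitted (on the assumption that the error surface is a convex quadratic, so the curvature stays positive), the Fletcher-Reeves update is an arbitrary choice, and the two-weight "ravine" error function, the function names, and the hand-picked learning rate for the comparison are all hypothetical.

```python
# Simplified sketch: conjugate gradient with a finite-difference curvature
# estimate instead of a line search. NOT the full SCG algorithm; the
# adaptive curvature scaling is omitted (assumes a convex quadratic error).

def grad(w):
    """Gradient of a hypothetical ravine E(w) = 0.5 * (w0**2 + 100 * w1**2)."""
    return [w[0], 100.0 * w[1]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y for plain lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def hessian_vector(grad_fn, w, p, sigma=1e-4):
    """Approximate H p from two gradient evaluations, storing only vectors."""
    g0 = grad_fn(w)
    g1 = grad_fn(axpy(sigma, p, w))
    return [(a - b) / sigma for a, b in zip(g1, g0)]

def cg_minimize(grad_fn, w, tol=1e-6, max_iter=100):
    """Return (weights, iterations) using curvature-based step sizes."""
    r = [-g for g in grad_fn(w)]          # steepest-descent direction
    p = list(r)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return w, k
        s = hessian_vector(grad_fn, w, p)
        delta = dot(p, s)                 # curvature along p
        alpha = dot(p, r) / delta         # step size, no line search
        w = axpy(alpha, p, w)
        r_new = [-g for g in grad_fn(w)]
        beta = dot(r_new, r_new) / dot(r, r)   # Fletcher-Reeves (a choice)
        p = axpy(beta, p, r_new)
        r = r_new
    return w, max_iter

def gd_minimize(grad_fn, w, lr=0.019, tol=1e-6, max_iter=10000):
    """Fixed-step gradient descent with a hand-picked rate, for comparison."""
    for k in range(max_iter):
        g = grad_fn(w)
        if dot(g, g) ** 0.5 < tol:
            return w, k
        w = axpy(-lr, g, w)
    return w, max_iter

w_cg, n_cg = cg_minimize(grad, [1.0, 1.0])
w_gd, n_gd = gd_minimize(grad, [1.0, 1.0])
print("conjugate gradient iterations:", n_cg)
print("gradient descent iterations:", n_gd)
```

On this toy quadratic the conjugate directions should need only a handful of iterations (two would suffice in exact arithmetic), while the fixed-step descent is held back by the steep direction of the ravine and should take far longer. A neural network error surface is not quadratic, which is where a safeguard such as the scaling in SCG becomes necessary.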
