This invention is concerned with training neural systems and estimating regression models. The invention is applicable to neural systems comprising either a neural network or a neural network and a preprocessor and/or a postprocessor, and to regression models that are either mathematical functions or dynamical systems, whose input and output variables are either continuous or discrete or both. Neural systems, neural networks and regression models are hereinafter referred to as NSs, NNs and RMs respectively. “Estimating regression models” means estimating regression coefficients of regression models, which is also called “training regression models.” Neural systems and neural networks herein considered are artificial neural systems and artificial neural networks respectively. Neural networks and nonlinear regression models have been applied to control, communication, robotics, geophysics, sonar, radar, economics, financial markets, signal/speech/image processing, etc.
Neural networks are trained and nonlinear regression models are estimated usually through the minimization of an error criterion. The error criterion used is usually a nonconvex function of weights of the neural network under training or regression coefficients of the nonlinear regression model under estimation. The nonconvexity of the error criterion may cause a local-search optimization procedure to produce a poor local minimizer of the error criterion. Since a good global optimization method does not exist in the prior art and a local-search optimization method is usually used, avoiding poor local minima of the nonconvex error criterion has been a major concern with training neural networks and estimating nonlinear regression models.
A common practice for avoiding poor local minima of the error criterion is to repeat a local-search optimization procedure a certain number of times with different initial guesses of the weights or regression coefficients and to select the neural network or nonlinear regression model that has the smallest value of the error criterion. This involves a large amount of computation, and the selected neural network or nonlinear regression model may still be far from optimal with respect to the error criterion. Therefore, a method of training neural networks and estimating nonlinear regression models that is able to avoid poor local minima is highly desirable.
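The common practice described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in chosen for illustration and is not taken from this disclosure: a one-dimensional nonconvex function `error` with one poor and one good local minimum, plain gradient descent as the local-search procedure, and the names `local_search` and `multi_start`.

```python
import random

def error(w):
    # Hypothetical nonconvex error criterion with two local minima:
    # a poor one near w = 1.13 and a better one near w = -1.30.
    return w**4 - 3 * w**2 + w

def grad(w):
    # Derivative of the hypothetical error criterion above.
    return 4 * w**3 - 6 * w + 1

def local_search(w0, lr=0.01, steps=2000):
    # Plain gradient descent; it converges to whichever local
    # minimum lies downhill from the initial guess w0.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def multi_start(n_starts, seed=0, low=-2.0, high=2.0):
    # Repeat the local search from random initial guesses and keep
    # the result with the smallest value of the error criterion.
    rng = random.Random(seed)
    candidates = [local_search(rng.uniform(low, high)) for _ in range(n_starts)]
    return min(candidates, key=error)

poor = local_search(1.5)   # a single run trapped in the poor local minimum
best = multi_start(20)     # best of 20 runs from random initial guesses
print(error(poor), error(best))
```

The computational cost grows linearly with the number of restarts, and in higher-dimensional weight spaces no fixed number of restarts guarantees that a good minimum is found, which is the limitation noted above.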
In recent U.S. Pat. No. 5,987,444, entitled “Robust Neural Systems” and granted 16 Nov. 1999, a robust neural system for robust processing was disclosed for averting unacceptably large or disastrous processing errors. The training methods described in U.S. Pat. No. 5,987,444 are often numerically infeasible, especially if the value of the risk-sensitivity index of the risk-averting error criterion used is large. A numerically feasible and effective method of training neural networks and estimating nonlinear regression models into robust neural systems and regression models is also highly desirable.
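One way such numerical infeasibility can arise may be sketched as follows. The sketch assumes, purely for illustration, a criterion that sums exponentials of squared errors scaled by a risk-sensitivity index `lam`; this section does not state the form of the criterion, and the function names and error values are hypothetical.

```python
import math

def risk_averting_error(errors, lam):
    # Assumed illustrative form (not stated in this section):
    # the sum over samples of exp(lam * squared error).
    return sum(math.exp(lam * e * e) for e in errors)

def is_representable(errors, lam):
    # True if the criterion can be evaluated in double precision;
    # math.exp raises OverflowError when its result is too large.
    try:
        risk_averting_error(errors, lam)
        return True
    except OverflowError:
        return False

errors = [0.5, 2.0, 6.0]                 # hypothetical per-sample errors
print(is_representable(errors, 1.0))     # largest exponent is 36
print(is_representable(errors, 20.0))    # largest exponent is 720
```

Under this assumed form, a single large error combined with a large index pushes the exponent past the range of double-precision arithmetic, so neither the criterion nor its gradient can be evaluated directly.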
A new method of training neural networks and estimating nonlinear regression models, together with its variants, is herein disclosed. The method and its variants have the ability to avoid poor local minima and/or produce robust neural systems.