This invention is concerned with robust processing and robust adaptive processing by artificial neural systems (NSs) to avoid unacceptable or disastrous processing performance. The invention disclosed herein is applicable in a large number of fields, including pattern recognition, signal/speech processing, system identification/control, communication, robotics, biomedical electronics, mechanical design, sound/vibration cancellation, economics, geophysics, sonar/radar data processing, oceanography, time series prediction, financial market forecasting, etc. An artificial NS is hereinafter referred to as an NS.
One of the major concerted efforts of the past 15 years in conventional control theory has been the development of the so-called "H.sup..infin. -optimal control theory," which addresses the issue of worst-case controller design for linear plants subject to unknown additive disturbances and plant uncertainties. Many references can be found in B. A. Francis, A Course in H.sup..infin. Control Theory, Springer-Verlag, New York (1987); and T. Basar and P. Bernhard, H.sup..infin. -Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 2nd Edition, Birkhauser, Boston, Mass. (1995). Although the idea of worst-case design is somewhat conservative, "H.sup..infin. -optimal" has become a synonym for the word "robust" in the control theory community.
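For reference, the H.sup..infin. norm underlying this worst-case design philosophy is, for a stable discrete-time linear system with transfer matrix G, the peak gain over frequency, which equals the induced l.sup.2 (energy) gain from disturbance to output (the notation below is standard but illustrative, not taken from the cited works):

```latex
\left\| G \right\|_{\infty}
  \;=\; \sup_{\omega \in [0,\,2\pi)} \bar{\sigma}\!\left( G\!\left(e^{j\omega}\right) \right)
  \;=\; \sup_{0 \neq w \in \ell^{2}} \frac{\left\| G w \right\|_{2}}{\left\| w \right\|_{2}},
```

where \bar{\sigma} denotes the largest singular value. Minimizing this norm thus minimizes the worst-case amplification of disturbance energy, which is the sense in which an H.sup..infin. -optimal controller is "robust."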
Among the many interpretations of and alternative approaches to H.sup..infin. -optimality for robust linear control and filtering are those based on the minimax criteria in dynamic games and the risk-sensitive (or exponential cost) criteria. The risk-sensitive criteria were first proposed, for optimal stochastic linear control, in D. H. Jacobson, "Optimal Stochastic Linear Systems with Exponential Performance Criteria and Their Relation to Deterministic Games," IEEE Transactions on Automatic Control, AC-18-2, pp. 124-131 (1973). The relationships among the H.sup..infin. criteria, the minimax criteria in dynamic games, and the risk-sensitive criteria have attracted a great deal of attention in the past few years. Some well-known references are K. Glover and J. C. Doyle, "State-Space Formulae for All Stabilizing Controllers That Satisfy an H.sup..infin. Norm Bound and Relations to Risk-Sensitivity," Systems Control Letters, vol. 11, pp. 167-172 (1988); P. Whittle, Risk Sensitive Optimal Control, Wiley, New York (1990); J. L. Speyer, C.-H. Fan and R. N. Banavar, "Optimal Stochastic Estimation with Exponential Cost Criteria," Proceedings of the 31st Conference on Decision and Control, IEEE, New York, N.Y. (1992); T. Basar and P. Bernhard, H.sup..infin. -Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 2nd Edition, Birkhauser, Boston, Mass. (1995); and B. Hassibi, A. H. Sayed and T. Kailath, "H.sup..infin. -Optimality of the LMS Algorithm," IEEE Transactions on Signal Processing, vol. 44, pp. 267-280 (1996). For linear systems, H.sup..infin. -optimal controllers and filters can be derived by minimizing some risk-sensitive criteria. Extending these robust processing results to nonlinear problems by the conventional analytic approach is a topic of current research; e.g., W. H. Fleming and W. M. McEneaney, "Risk Sensitive Optimal Control and Differential Games," Stochastic Theory and Adaptive Control, pp. 185-197, vol.
184 of Lecture Notes in Control and Information Sciences, Springer-Verlag, Berlin (1992); M. R. James, J. S. Baras and R. J. Elliott, "Risk Sensitive Control and Dynamic Games for Partially Observed Discrete-Time Nonlinear Systems," IEEE Transactions on Automatic Control, AC-39(4), pp. 780-792 (1994); J. S. Baras and N. S. Patel, "Information State for Robust Control of Set-Valued Discrete-Time Systems," Proceedings of the 34th Conference on Decision and Control, pp. 2302-2307, New Orleans, La. (1995); and W. Lin and C. I. Byrnes, "H.sup..infin. -Control of Discrete-Time Nonlinear Systems," IEEE Transactions on Automatic Control, vol. 41, No. 4 (1996). However, certain structures of the mathematical models involved are assumed in these papers, and no systematic conventional method is available for designing a robust discrete-time processor that is optimal or near-optimal with respect to a robustness criterion for a general nonlinear operating environment.
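For reference, the risk-sensitive (exponential cost) criterion proposed by Jacobson replaces the expected quadratic cost with the expectation of its exponential. In one common notation (illustrative, not taken verbatim from the cited works), with state x.sub.t, control u.sub.t, and weighting matrices Q and R:

```latex
J_{\theta} \;=\; \frac{1}{\theta}\,\ln \operatorname{E}\!\left[ \exp\!\left( \theta\, \Psi \right) \right],
\qquad
\Psi \;=\; \tfrac{1}{2} \sum_{t} \left( x_{t}^{\mathsf{T}} Q\, x_{t} + u_{t}^{\mathsf{T}} R\, u_{t} \right).
```

As the risk-sensitivity parameter \theta tends to 0, J.sub..theta. reduces to the ordinary (risk-neutral) expected cost E[.PSI.], while \theta > 0 penalizes the higher moments of the cost; it is in an appropriate limit of the \theta > 0 case that the minimax (H.sup..infin. -type) behavior discussed above emerges.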
Since neural networks are known to be effective nonlinear processors, let us examine the prior art of neural networks (NNs) for robust processing. There are many good books on NNs and their applications. A good introduction to NNs can be found in R. Hecht-Nielsen, Neurocomputing, Addison-Wesley (1990); J. Hertz, A. Krogh and R. G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley (1991); S. Haykin, Neural Networks, Macmillan College Publishing Company (1994); and M. H. Hassoun, Fundamentals of Artificial Neural Networks, MIT Press (1995). Applications of NNs can be found in D. A. White and D. A. Sofge, editors, Handbook of Intelligent Control, Van Nostrand Reinhold (1992); B. Kosko, editor, Neural Networks for Signal Processing, Prentice Hall (1992); D. P. Morgan and C. L. Scofield, Neural Networks and Speech Processing, Kluwer Academic Publishers (1991); and E. Sanchez-Sinencio and C. Lau, editors, Artificial Neural Networks, IEEE Press (1992). There are also a large number of research articles concerning neural networks, which can be found in journals (e.g., IEEE Transactions on Neural Networks, Neural Networks, and Neural Computation) and in conference proceedings (e.g., Proceedings of the International Joint Conference on Neural Networks).
Patent documents concerning NNs (neural networks) and their applications are too numerous to list here. Three that seem highly relevant to the present invention are as follows. In U.S. Pat. No. 5,003,490 (1991) to P. F. Castelaz and D. E. Mills, a multilayer perceptron with a sigmoid activation function and a tapped delay line for the input is used to classify input waveforms. In U.S. Pat. No. 5,150,323 (1992) to P. F. Castelaz, a multilayer perceptron with a sigmoid activation function and tapped delay lines for preprocessed inputs is used for in-band separation of a composite signal into its constituent signals. In U.S. Pat. No. 5,408,424 (1995) to James T. Lo, recurrent neural networks are used for optimal filtering.
A neural system (NS) comprising an NN and at least one range transformer is disclosed in U.S. Pat. No. 5,649,065 (1997) to James T. Lo for optimal filtering when the range of the exogenous input process or outward output process of the NS is necessarily large and/or keeps expanding during the operation of the NS.
So far, the main concern in synthesizing an NS, whether it comprises a neural network or a neural network and at least one range transformer, has been a good overall processing performance. However, a good overall processing performance may be accompanied by disastrous or unacceptable processing performance on some individual runs of the NS. The issue of robustness for multilayer perceptrons is considered in B. Hassibi and T. Kailath, "H.sup..infin. Optimal Training Algorithms and their Relation to Backpropagation," Advances in Neural Information Processing Systems, vol. 7, pp. 191-199, edited by G. Tesauro, D. S. Touretzky and T. K. Leen, MIT Press, Cambridge, Mass. (1995). Global H.sup..infin. optimal training algorithms for multilayer perceptrons are derived therein. Unfortunately, the ensuing estimators of the weights of a multilayer perceptron under training are infinite-dimensional, requiring growing memory. Upon a specialization, they reduce to a finite-dimensional, but only locally H.sup..infin. optimal, estimator, which is the well-known backpropagation algorithm. The local H.sup..infin. optimality of the backpropagation algorithm means that it "minimizes the energy gain from the disturbances to the prediction errors, only if the initial condition is close enough to the true weight vector and if the disturbances are small enough." Besides these results on multilayer perceptrons, the issue of robustness has not been considered for neural networks in the open literature.
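The energy-gain statement quoted above can be written as follows (the symbols here are illustrative, not taken verbatim from the cited work): a training algorithm producing weight estimates from noisy data is H.sup..infin. optimal at level .gamma. if

```latex
\sup_{w,\; v \neq 0}\;
\frac{\displaystyle \sum_{t} \left| e_{t} \right|^{2}}
     {\displaystyle \mu^{-1} \left\| w - \hat{w}_{0} \right\|^{2} + \sum_{t} \left| v_{t} \right|^{2}}
\;\le\; \gamma^{2},
```

where e.sub.t denotes the prediction errors, w the true weight vector, w.sub.0 the initial weight estimate, v.sub.t the disturbances, and .mu. a step-size parameter. The denominator is the total disturbance energy (including the error in the initial guess) and the numerator the prediction-error energy. Backpropagation attains such a bound only locally, i.e., only for initial weights sufficiently close to the true weight vector and for sufficiently small disturbances, which is precisely the limitation noted above.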
In summary, a systematic method, conventional or neural-network-based, of designing a robust processor that is optimal or near-optimal with respect to a robustness criterion for a general nonlinear operating environment is greatly desired.