It has been customary practice for many years to use universal approximators, such as neural networks, to model complex non-linear, multi-variable functions. Industrial application of such technologies has been particularly prevalent in the area of inferential or soft sensor predictors. For example, see Neuroth, M., MacConnell, P., Stronach, F., Vamplew, P. (April 2000): “Improved modeling and control of oil and gas transport facility operations using artificial intelligence.”, Knowledge Based Systems, vol. 13, no. 2, pp. 81-9; and Molga, E. J., van Woezik, B. A. A., Westerterp, K. R.: “Neural networks for modeling of chemical reaction systems with complex kinetics: oxidation of 2-octanol with nitric acid”, Chemical Engineering and Processing, July 2000, vol. 39, no. 4, pp. 323-334. Many industrial processes require quality control of properties that are still expensive, if not impossible, to measure on-line. Inferential quality estimators have been used to predict such qualities from easy-to-measure process variables, such as temperatures and pressures. Often, the complex interactions within a process (particularly in polymer processes) manifest as complex non-linear relationships between the easy-to-measure variables and the quality parameters.
Historically, conventional neural networks (or other generic non-linear approximators) have been used to represent these complex non-linearities. For example, see Zhang, J., Morris, A. J., Martin, E. B., Kiparissides, C.: “Estimation of impurity and fouling in batch polymerization reactors through application of neural networks”, Computers in Chemical Engineering, Feb. 1999, vol. 23, no. 3, pp. 301-314; and Huafang, N., Hunkeler, D.: “Prediction of copolymer composition drift using artificial neural networks: copolymerization of acrylamide with quaternary ammonium cationic monomers”, Polymer, February 1997, vol. 38, no. 3, pp. 667-675. Historical plant data is used to train the models (i.e., determine the model coefficients), and the objective function for a model is set so as to minimize model error on some arbitrary (but representative) training data set. The algorithms used to train these models focus on model error. Little or no attention is paid to the accuracy of the derivative of the converged function.
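The training objective described above can be written compactly. The following is only a sketch: the sum-of-squares form and the symbols f (the model), θ (its coefficients), and (x_i, y_i) (the N historical training samples) are assumptions introduced here for illustration, since the text states only that model error is minimized.

```latex
\min_{\theta} \; E(\theta) \;=\; \sum_{i=1}^{N} \bigl( y_i - f(x_i;\theta) \bigr)^{2}
```

Under this assumed form, no term in E depends on the derivative \partial f / \partial x, so two models with identical training error may have very different gains between the training points.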
This focus on model error (without other considerations) prohibits the use of such paradigms (i.e., conventional neural networks) in closed loop control schemes, since the objective of a non-linear model in such schemes is usually to schedule the gain and lag of the controller. Although jacketing can be used to restrict the models from working in regions of one-dimensional extrapolation, the models will still be expected to interpolate between operating points. A linear or well-behaved non-linear interpolation is therefore required. The gains may not match the actual process exactly, but at the very least the trajectory should be monotonically sympathetic to the general changes in the process gain when moving from one operating point to another.
Work has been undertaken to understand the stability of dynamic conventional neural networks in closed loop control schemes. Kulawski et al. have recently presented an adaptive control technique for non-linear stable plants with unmeasurable states (see Kulawski, G. J., Brdyś, M. A.: “Stable adaptive control with recurrent networks”, Automatica, 2000, vol. 36, pp. 5-22). The controller takes the form of a non-linear dynamic model used to compute a feedback linearizing controller. The stability of the scheme is shown theoretically. The Kulawski et al. paper emphasizes the importance of monotonic activation functions in the overall stability of the controller. However, the argument is not extended to the case of inappropriate gain estimation in areas of data sparseness.
Universal approximators (e.g., conventional neural networks) cannot guarantee that the derivatives will be well behaved when interpolating between two points. The very nature of these models means that, in a region of missing or sparse data between two regions of sufficient data, the predicted output could take any value. As far as the training objective is concerned, provided that the two end points of the trajectory fit, the path between those points is unimportant. One of the key advantages of the present invention is that it uses a priori knowledge of the process gain trajectory (e.g., monotonic gain, bounded gain, etc.) and constrains the estimator to solutions that possess these properties.
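The interpolation problem described above can be illustrated with a small numerical sketch. It is not the estimator of the present invention: a cubic interpolating polynomial stands in for a generic unconstrained approximator, and the data values and the helper names `lagrange` and `gain` are invented for the example. The training outputs increase monotonically with the input, and the model fits them exactly, yet its gain changes sign in the unsampled gap between the two operating regions.

```python
# Illustrative only: a cubic interpolant stands in for a generic
# unconstrained non-linear approximator. Data values are invented.

def lagrange(xs, ys):
    """Return the polynomial that passes exactly through the points (xs, ys)."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return model

def gain(model, x, h=1e-4):
    """Central finite-difference estimate of d(output)/d(input) at x."""
    return (model(x + h) - model(x - h)) / (2.0 * h)

# Two sampled operating regions (inputs 0-1 and 9-10) with no data between
# them. The outputs increase monotonically with the input.
xs = [0.0, 1.0, 9.0, 10.0]
ys = [0.0, 1.0, 2.0, 2.1]

model = lagrange(xs, ys)

# Model error on the training data is essentially zero ...
max_fit_error = max(abs(model(x) - y) for x, y in zip(xs, ys))

# ... yet the gain becomes negative somewhere inside the unsampled gap.
gap = [1.0 + 0.5 * k for k in range(17)]  # 1.0, 1.5, ..., 9.0
min_gap_gain = min(gain(model, x) for x in gap)

print(f"max fit error on training data: {max_fit_error:.2e}")
print(f"minimum gain inside the gap:    {min_gap_gain:.4f}")
```

A controller that scheduled its gain from such a model would see the apparent process gain reverse sign between the two operating points, even though nothing in the training data suggests a reversal.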
The benefits of including a priori knowledge in the construction of non-linear approximators have been cited in many areas. Lindskog et al. discuss the monotonic constraining of fuzzy model structures and apply such an approach to the control of a water heating system (see Lindskog, P., Ljung, L.: “Ensuring monotonic gain characteristics in estimated models by fuzzy model structures”, Automatica, 2000, vol. 36, pp. 311-317). Abu-Mostafa discusses one method of “tempting” a neural network to have localized monotonic characteristics by “inventing” pseudo-training data that possesses the desired non-linear characteristics (see Abu-Mostafa, Y. S.: “Machines that learn from hints”, Scientific American, April 1995, pp. 64-69). This method does not, however, guarantee global adherence to the desired input/output relationship.
Thus, it is well accepted that universal approximators should not be used in regions where they must extrapolate from the data. Since they are capable of modeling any non-linearity, any result could occur in regions outside, and even at the limits of, the training data range.
For process control, constraining the behavior of an empirical non-linear model (within its input domain) is essential for successful exploitation of non-linear advanced control. Universal approximators, such as conventional neural networks, cannot be used in advanced control schemes for gain scheduling without seriously deteriorating the potential control performance.