1. Field of the Invention
The present invention relates generally to a process for the neural modeling of dynamic processes having very different time constants.
2. Description of the Related Art
Neural networks are employed in the most diverse of technical fields. They have proven to be particularly suitable wherever decisions are to be derived from complex technical relationships and from insufficient information. To form one or more output variables, one or more input variables are fed to the neural network. For this purpose, such a network is initially trained for the specific use, is subsequently generalized, and is then validated using a data set other than the training data. Neural networks prove to be particularly suitable for many uses, since they are universally trainable.
A problem which often occurs in conjunction with the use of neural networks is, however, that the number of inputs of the neural network is often too large and hence the network appears to be unnecessarily complex for the application. In particular, the excessively large neural network does not reach the required performance on the generalization data during training. It then often learns application examples by heart rather than learning the problem structure. In practice, it is therefore desirable to limit the number of possible input variables as far as possible to those that are necessary, that is to say to the number of input variables which have the greatest effects on the output variables that are to be determined. The problem may also arise in practice that a neural network is intended to be supplied with input variables which arise at different times, some of which are located hours or days apart. For such eventualities, for example, recurrent neural networks are used. These networks contain feedback paths between the neurons internally and it is thus possible for them to construct a type of memory about the input variables which have arisen. However, because of the simpler handling, in particular because they are more easily trainable, forwardly directed neural networks often appear desirable in the field of use.
In the case of industrial processes, in particular in the case of biochemical processes, different partial processes having very different time constants often interact. Chemical reactions often take place in a fraction of a second. During the degradation or synthesis of materials by microorganisms and the growth and death of bacteria or fungi, time constants of hours or days often occur. Time constants in the range of hours and days occur in particular in systems in which there are material circulations having feedback and intermediate storage. Separate treatment of the partial processes, which progress at different speeds, is often not possible. Thus, for example, there is a close coupling between the individual processes proceeding in the purification of sewage. In addition, measured values "between" the individual processes can be obtained only at very high cost, if at all, as a precondition for separate neural modeling. This is true in particular in the case of the biochemical processes which proceed in sewage treatment in sewage treatment plants.
A suitable representation of the input data, in conjunction with the selection of relevant process information by means of the neural network, is the precondition for being able to model the simultaneous action of different time constants neurally.
In order to be able to model very rapidly progressing partial processes neurally, it is necessary, on the one hand, for data to be obtained at a very high sampling rate and to be applied to the neural network as input variables. On the other hand, for modeling the slowly progressing processes, the data over a range reaching appropriately far back into the past are to be applied as input values to a forwardly directed network. This method of proceeding has the disadvantage that the neural network has a large number of inputs even with just a small number of measured variables and hence has a large quantity of adaptable parameters. As a result of this high number of free parameters, the network has a complexity which is higher than is appropriate for the data, and tends toward "overfitting", see the references Hergert, F., Finnoff, W., Zimmermann, H. G.: "A comparison of weight elimination methods for reducing complexity in neural networks", in Proceedings of the Int. Joint Conf. on neural networks, Baltimore, 1992, and Hergert, F., Finnoff, W., Zimmermann, H. G.: "Evaluation of Pruning Techniques", ESPRIT Projekt Bericht [Project Report] 5293 - Galatea, Doc. No.: S23.M12.-August 1992. Thus, in the case of the data points used for training, the neural network indeed reaches a very good approximation to the data. In the case of the "generalization data" not used for training, however, networks having too many adaptable parameters exhibit poor performance.
An alternative possibility for the neural modeling of processes having very different time constants is the use of recurrent neural networks (RNN). Because of the feedback which is realized in the network, RNN are capable of storing information from previous data and thus of modeling processes having long time constants or having feedback.
The disadvantage of RNN is that simple learning processes such as, for example, back propagation, can no longer be used and, instead, specific learning processes such as, for example, Real Time Recurrent Learning (RTRL) or Back Propagation Through Time (BPTT) must be employed. Especially in the case of a high number of data points, RNN are difficult to train and tend toward numerical instability, see the reference Sterzing, V., Schirmann, B.: "Recurrent Neural Networks for Temporal Learning of Time Series", in Proceedings of the 1993 International Conference on Neural Networks, March 27-31, San Francisco, 843-850.
The international patent publication WO-A-94/17489 discloses a Back Propagation network which defines preprocessing parameters and time delays in the training mode. In the operating mode, the operating data are then processed together with the preprocessing parameters and, together with the defined time delays, are fed into the system as input data. Such a network is particularly suitable for applications in which the input data are based on different time scales.
The journal article IEEE EXPERT, Volume 8, No. 2, Apr. 1, 1993, pages 44 to 53, by J. A. Leonard and M. A. Kramer: "Diagnosing dynamic faults using modular neural nets" discloses general possibilities of the diagnosis of dynamic errors in modular networks. In that publication, for example, the time aspect of input data is taken into account in that each time a new input data set is added, the oldest data set is dispensed with and the system then carries out a new calculation. By this means, the number of input data sets to be taken into account by the system in each case is kept within limits. In the prior art cited, therefore, limited possibilities are indicated for taking into account the past of the system. Furthermore, their different techniques are extensively explained. No further relevant prior art is known.
An object on which the invention is based is to provide an arrangement and a process with which the number of input variables of a neural network that arise over time can be reduced in a practical manner. It is intended, in particular, by means of the inventive process and the inventive arrangement to realize a memory, or remembrance, for forwardly directed neural networks. Furthermore, it is intended that the inventive process shall meet the special requirements which underlie chemical processes having different time constants.
This and other objects and advantages are achieved in a process for conditioning an input variable of a neural network,
a) in which a time series is formed from a set of values of the input variable by determining the input variable at discrete times,
b) in which, from the time series, at least a first interval is delimited in such a way that the length of the interval is selected to be greater the further the values therein lie back in the past,
c) in which the first partial time series delimited by the first interval is convoluted with an essentially bell-shaped function to form an average and the first maximum value of the convolution result is formed,
d) and in which the neural network is fed the first maximum value which is representative of the first partial time series essentially simultaneously with another input value which is located within the time series but outside the first interval.
A control arrangement is also provided by the present invention, for example for a chemical process,
a) in which at least one sensor is provided for acquiring a first measured value,
b) in which at least one memory is provided for storing measured values,
c) in which a preprocessor is provided for measured values stored in the memory,
d) in which a forwardly directed neural network is provided,
e) and in which at least one effector is provided for influencing the chemical process,
f) and in which there is stored in the memory a time series, formed from a set of measured values measured at regular time intervals, which series is fed to the preprocessor, the latter combining a plurality of measured values in one interval in such a way that the length of the interval is selected to be greater the further the measured values located therein lie back in the past, and deriving a single value therefrom by means of convolution with a bell-shaped curve, and feeding this value, together with a further value from the time series stored in the memory, to the neural network, from which values the latter forms a manipulated variable for the chemical process, and in which control arrangement this manipulated variable is forwarded to the effector.
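By way of illustration only, steps a) to d) of the conditioning process may be sketched as follows. The sketch is a minimal one under stated assumptions and is not the definitive implementation: the function names `gaussian_kernel` and `condition_inputs`, the ordering of the series (oldest value first), the use of NumPy, and the choice of non-overlapping intervals placed directly behind the most recent value are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(length, sigma):
    """Bell-shaped weights over `length` samples, normalized to sum to 1
    so that the convolution forms a weighted average."""
    offsets = np.arange(length) - (length - 1) / 2.0
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return weights / weights.sum()


def condition_inputs(series, interval_lengths):
    """Condense a time series into a few network inputs.

    series           -- 1-D sequence, oldest value first, newest value last
    interval_lengths -- interval lengths in samples, ordered from the recent
                        past to the distant past (assumed non-decreasing)

    Returns the most recent value followed by one representative value
    per interval.
    """
    series = np.asarray(series, dtype=float)
    features = [series[-1]]          # input value outside the intervals
    stop = len(series) - 1           # first interval ends before the newest value
    for length in interval_lengths:
        start = stop - length
        if start < 0:
            raise ValueError("time series too short for the requested intervals")
        partial = series[start:stop]
        # Assumption: kernel width and standard deviation equal the interval length.
        smoothed = np.convolve(partial, gaussian_kernel(length, sigma=length), mode="same")
        features.append(smoothed.max())  # maximum of the convolution result
        stop = start                      # next interval lies further in the past
    return np.array(features)
```

With, for example, 15 stored samples and interval lengths of 2, 4 and 8 samples, the network receives 4 inputs per measured variable instead of 15.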
Additional developments of the invention provide that the process as described above includes a plurality of interval groups being delimited, in each case at least two successive intervals of such an interval group having the same length, and in which a process analogous to that above is followed for the determination of the respective interval length of an interval group.
The process preferably provides that the interval limits of two adjacent intervals are selected such that the intervals have at least one value from the time series in common. The width of the bell-shaped curve has essentially the same length as the respective interval to be convoluted therewith. A Gaussian distribution is selected as the bell-shaped curve of an exemplary embodiment. A standard deviation of the magnitude of the interval length is used. The input values are derived from a system which has at least one known time constant, and in which at least one interval length of the magnitude of the time constant, or a multiple thereof, is selected. The input values are fed to a forwardly directed neural network. The values of the series are normalized to a fixed interval.
The present process is used both during training and during operation of the neural network. In a first step, unconvoluted values from the time series are fed to the neural network for training and, in a second step, said network is trained with the input variables which result from the convolution of values from the time series from the past.
At least one time series is formed from measured values from a sewage treatment plant.
A particular advantage of the process according to the invention is that it allows process data to be taken into account over a very long period of time, in that a large measured value interval is used for measured values which lie back over a relatively long time period, and a sensible average is obtained from this interval by means of a convolution. A suitable selection of the convolution curve and the interval size thus makes it possible to introduce a specific damping, or attenuation, over time into the consideration of the individual input values, in that, for example, measured values which lie further back are convoluted over a longer interval and using a flatter bell curve.
In the process according to the invention, a different interval size can advantageously be introduced for the purpose of accurately taking into account the time-dependent input variables, the interval becoming ever greater in the direction of the past, it also being advantageously possible, for example, to form different interval groups which have the same length.
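The formation of interval groups may be illustrated by the following minimal sketch. The function name `interval_group_lengths`, the default of two intervals per group, and the doubling of the length from one group to the next are assumptions chosen for this example; the text above requires only that intervals within a group share a length and that lengths grow toward the past.

```python
def interval_group_lengths(base_length, n_groups, intervals_per_group=2, growth=2):
    """Return interval lengths ordered from the recent past to the distant past.

    Each group contains `intervals_per_group` successive intervals of equal
    length; the length is multiplied by `growth` from one group to the next
    (an assumed growth rule).
    """
    lengths = []
    length = base_length
    for _ in range(n_groups):
        lengths.extend([length] * intervals_per_group)
        length *= growth
    return lengths


# Three groups starting at one sample: [1, 1, 2, 2, 4, 4]
print(interval_group_lengths(1, 3))
```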
In the case of the process according to the invention, provision is advantageously made for selecting the intervals in such a way that they mutually overlap, since thus each value in the time series is used for determining the overall input variables of the neural network. The weighting of the individual values of the time series in this case becomes, for example, more homogeneous the greater the overlapping range of the individual intervals.
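One possible way of laying out mutually overlapping intervals is sketched below. The function name `overlapping_intervals`, the index convention (oldest value at index 0, `stop` exclusive) and the default overlap of one sample are assumptions for this illustration.

```python
def overlapping_intervals(n_samples, lengths, overlap=1):
    """Return (start, stop) index pairs, `stop` exclusive, for a series
    stored oldest first. The first pair is the most recent interval, and
    adjacent intervals share `overlap` samples."""
    if overlap >= min(lengths):
        raise ValueError("overlap must be smaller than every interval length")
    intervals = []
    stop = n_samples
    for length in lengths:
        start = stop - length
        if start < 0:
            raise ValueError("time series too short for the requested intervals")
        intervals.append((start, stop))
        stop = start + overlap  # next (older) interval reuses the oldest samples
    return intervals


# 20 samples, lengths 2, 4 and 8: [(18, 20), (15, 19), (8, 16)]
print(overlapping_intervals(20, [2, 4, 8]))
```

In this example the sample at index 18 belongs to the first and second intervals, and the sample at index 15 to the second and third.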
Preferably, the width of the bell curve with which the time series is convoluted within the interval is selected such that it corresponds in magnitude to the length of the interval, because in this way a sensible convolution result is obtained which, in practice, corresponds to the values acquired within the interval.
Preferably, a Gaussian bell curve is selected as convolution curve, since this curve approximates statistical processes in nature and thus appears to be particularly accurately adapted to the values within the time series. For this purpose, the standard deviation is in particular selected to be as wide as the length of the interval.
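The effect of selecting the standard deviation equal to the interval length can be checked with a short calculation. The sketch assumes that the bell curve is sampled only at the positions inside the interval and is centered on it; the function name `edge_to_center_ratio` is hypothetical.

```python
import math


def edge_to_center_ratio(length):
    """Weight of the outermost sample relative to the central weight for a
    Gaussian centered on an interval of `length` samples, with the standard
    deviation set equal to `length`."""
    sigma = float(length)
    edge_offset = (length - 1) / 2.0
    return math.exp(-0.5 * (edge_offset / sigma) ** 2)


for n in (2, 8, 100):
    print(n, round(edge_to_center_ratio(n), 4))
```

Under these assumptions the ratio never falls below exp(-1/8), roughly 0.88, so every value within the interval contributes with a comparable weight to the average.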
Preferably, in the process according to the invention, provision is made for forming the input data of the neural network as a time series of a process which has different time constants, and to take these time constants into account in dividing up the intervals of the time series, to be specific in such a form that the interval length corresponds to multiples of this time constant or of the several constants. It is thus ensured that the time behavior of the process to be monitored or controlled can be particularly accurately approximated by the process according to the invention.
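Deriving interval lengths from known time constants may be sketched as follows. The function name `interval_lengths_from_time_constants`, the units (hours) and the rounding rule are assumptions; the time constants of 0.5 and 6 hours in the usage example are hypothetical values chosen only for illustration.

```python
def interval_lengths_from_time_constants(time_constants_hours, sample_period_hours, multiple=1):
    """Convert process time constants into interval lengths in samples.

    Each interval length is `multiple` times a time constant, expressed in
    sampling periods and rounded to at least one sample. Shorter time
    constants come first, i.e. nearer to the present.
    """
    lengths = []
    for tau in sorted(time_constants_hours):
        n_samples = max(1, round(multiple * tau / sample_period_hours))
        lengths.append(n_samples)
    return lengths


# Hypothetical time constants of 0.5 h, 6 h and 72 h, sampled every 15 minutes:
print(interval_lengths_from_time_constants([72, 0.5, 6], 0.25))  # [2, 24, 288]
```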
Preferably, the process according to the invention is carried out using a forwardly directed neural network, since these have no memory and, by means of the method according to the invention, the past is taken into account in a manner in which it can replace a memory for these networks.
Preferably, the input variables of the neural network are normalized to one range, in order to make more rapid and better training of the network possible.
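A minimal sketch of such a normalization is given below, assuming a linear scaling to the range 0 to 1 whose limits are taken from the training data; the function names `fit_range` and `normalize` and the choice of target range are assumptions.

```python
import numpy as np


def fit_range(training_values):
    """Determine the scaling limits from the training data."""
    training_values = np.asarray(training_values, dtype=float)
    return float(training_values.min()), float(training_values.max())


def normalize(values, low, high):
    """Scale values linearly so that `low` maps to 0 and `high` maps to 1."""
    if high == low:
        raise ValueError("training range is degenerate")
    return (np.asarray(values, dtype=float) - low) / (high - low)


low, high = fit_range([0.0, 5.0, 10.0])
print(normalize([0.0, 5.0, 10.0], low, high))  # [0.  0.5 1. ]
```

The limits determined during training would be reused unchanged during operation, in which case values outside the training range map outside the interval from 0 to 1.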
Preferably, the process according to the invention is used both during training of the network and in the operation of the same, since only in this way is it ensured that the learned behavior also produces the results which are desired in practice.
The process according to the invention is particularly advantageously used to control a chemical process which has different time constants. In particular, the input data can in this case be obtained from a sewage treatment plant, since there time constants of up to 72 hours, in addition to rapidly changing time constants, often play a part.
Particularly advantageous is a device according to the invention which has a measuring sensor, a memory for storing the measured values of this sensor, a preprocessor for conditioning the measured values, for example in accordance with the method according to the invention, and a neural network which evaluates the input data which has been generated and obtains from them a value which is fed to the effector in order to influence the chemical process. In this way, a particularly simple control arrangement is realized.
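The data flow of such a control arrangement may be sketched as follows. This is a schematic illustration only: the class name `ControlLoop` and its four callables (`read_sensor`, `preprocess`, `network`, `apply_effector`) are hypothetical stand-ins for the sensor, the preprocessor, the trained forwardly directed network and the effector.

```python
from collections import deque

import numpy as np


class ControlLoop:
    """Sensor -> memory -> preprocessor -> neural network -> effector."""

    def __init__(self, read_sensor, preprocess, network, apply_effector, memory_size):
        self.read_sensor = read_sensor
        self.preprocess = preprocess
        self.network = network
        self.apply_effector = apply_effector
        # The memory keeps the most recent `memory_size` measured values.
        self.memory = deque(maxlen=memory_size)

    def step(self):
        """Acquire one measured value and, once enough history is stored,
        forward a manipulated variable to the effector."""
        self.memory.append(float(self.read_sensor()))
        if len(self.memory) < self.memory.maxlen:
            return None  # not enough history yet
        features = self.preprocess(np.array(self.memory))
        manipulated_variable = self.network(features)
        self.apply_effector(manipulated_variable)
        return manipulated_variable


# Usage with stand-in components (all hypothetical):
readings = iter([1.0, 2.0, 3.0, 4.0])
sent = []
loop = ControlLoop(
    read_sensor=lambda: next(readings),
    preprocess=lambda s: np.array([s[-1], s[:-1].mean()]),
    network=lambda f: float(f.sum()),
    apply_effector=sent.append,
    memory_size=3,
)
results = [loop.step() for _ in range(4)]
```

In the usage example the first two steps only fill the memory; from the third step onward a manipulated variable is produced at every sampling instant.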