1. Field of the Invention
The invention relates to an arrangement of computation elements which are connected to one another to form a computer system, a method for computer-aided determination of a dynamic response on which a dynamic process is based, and a method for computer-aided training of an arrangement of computation elements which are connected to one another.
2. Description of the Related Art
Pages 732-789 of Neural Networks: A Comprehensive Foundation, Second Edition, by S. Haykin, published by Macmillan College Publishing Company in 1999, describe the use of an arrangement of computation elements which are connected to one another for determining a dynamic response on which a dynamic process is based.
In general, a dynamic process is normally described by a state transition description, which is not visible to an observer of the dynamic process, and an output equation, which describes observable variables of the technical dynamic process. One such structure is shown in FIG. 2.
A dynamic system 200 is subject to the influence of an external input variable u whose dimension can be predetermined, with the input variable at a time t being annotated ut:
ut ∈ ℝ^l,
where l denotes a natural number.
The input variable ut at a time t causes a change in the dynamic process taking place in the dynamic system 200.
An inner state st (st ∈ ℝ^m) at a time t, whose dimension m can be predetermined, cannot be observed by an observer of the dynamic system 200.
Depending on the inner state st and the input variable ut, a state transition is caused in the inner state st of the dynamic process, and the state of the dynamic process changes to a subsequent state st+1 at a subsequent time t+1.
In this case:
st+1 = f(st, ut),  (1)
where f( ) denotes a general mapping rule.
An output variable yt at a time t, which can be observed by an observer of the dynamic system 200, depends on the input variable ut and on the inner state st.
The output variable yt (yt ∈ ℝ^n) has a dimension n which can be predetermined.
The dependency of the output variable yt on the input variable ut and on the inner state st of the dynamic process is expressed by the following general rule:
yt = g(st, ut),  (2)
where g(.) denotes a general mapping rule.
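Rules (1) and (2) can be illustrated by a minimal simulation sketch. The concrete choices of f and g below (a tanh transition and a summed output), as well as the dimensions, are illustrative assumptions, not part of the description above:

```python
import numpy as np

def f(s, u):
    # assumed state transition rule (1): s_{t+1} = f(s_t, u_t)
    return np.tanh(s + u.sum())

def g(s, u):
    # assumed output rule (2): y_t = g(s_t, u_t)
    return s.sum() + u.sum()

# simulate the dynamic system for 10 time steps
rng = np.random.default_rng(0)
s = np.zeros(3)                 # inner state s_t, dimension m = 3
outputs = []
for t in range(10):
    u = rng.normal(size=2)      # input variable u_t, dimension l = 2
    outputs.append(g(s, u))     # observable output y_t
    s = f(s, u)                 # state transition to s_{t+1}
```

Only the values in `outputs` correspond to what an observer of the system can see; the successive states s remain hidden, which is exactly the situation described above.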
In order to describe the dynamic system 200, S. Haykin describes using an arrangement of computation elements, which are connected to one another, in the form of a neural network of neurons which are connected to one another. The connections between the neurons in the neural network are weighted. The weights in the neural network are combined in a parameter vector v.
An inner state of a dynamic system which is subject to a dynamic process is thus, in accordance with the following rule, dependent on the input variable ut and the inner state at the previous time st, and the parameter vector v:
st+1 = NN(v, st, ut),  (3)
where NN( ) denotes a mapping rule which is predetermined by the neural network.
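Rule (3) can be sketched as a single recurrent step in which NN is realized (as an illustrative assumption) as one tanh layer whose weight matrices together form the parameter vector v:

```python
import numpy as np

m, l = 4, 2                      # state and input dimensions (assumed)
rng = np.random.default_rng(1)
A = rng.normal(size=(m, m))      # state-to-state weights
B = rng.normal(size=(m, l))     # input-to-state weights
v = np.concatenate([A.ravel(), B.ravel()])  # parameter vector v

def nn_step(v, s, u):
    # rule (3): s_{t+1} = NN(v, s_t, u_t)
    A = v[:m * m].reshape(m, m)
    B = v[m * m:].reshape(m, l)
    return np.tanh(A @ s + B @ u)

s = np.zeros(m)
u = np.ones(l)
s_next = nn_step(v, s, u)        # subsequent inner state s_{t+1}
```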
An arrangement of computation elements which is referred to as a Time Delay Recurrent Neural Network (TDRNN) is described in David E. Rumelhart et al., Parallel Distributed Processing, Explorations in the Microstructure of Cognition, Vol. 1: Foundations, A Bradford Book, The MIT Press, Cambridge, Mass., London, England, 1987. The known TDRNN is illustrated in FIG. 5 as a neural network 500 which is unfolded over a finite number of times (the illustration shows five times: t−4, t−3, t−2, t−1, t).
The neural network 500 which is illustrated in FIG. 5 has an input layer 501 with five partial input layers 521, 522, 523, 524 and 525, each of which contains a number (which can be predetermined) of input computation elements, to which input variables ut−4, ut−3, ut−2, ut−1 and ut, that is to say time series values with predetermined time steps as described in the following text, can be applied at times t−4, t−3, t−2, t−1 and t which can be predetermined.
Input computation elements, that is to say input neurons, are connected via variable connections to neurons in a number (which can be predetermined) of concealed layers 505 (the illustration shows 5 concealed layers). In this case, neurons in a first 531, a second 532, a third 533, a fourth 534 and a fifth 535 concealed layer are respectively connected to neurons in the first 521, the second 522, the third 523, the fourth 524 and the fifth 525 partial input layer.
The connections between the first 531, the second 532, the third 533, the fourth 534 and the fifth 535 concealed layer and, respectively, the first 521, the second 522, the third 523, the fourth 524 and the fifth 525 partial input layers are each the same. The weights of all the connections are each contained in a first connection matrix B1.
Furthermore, the neurons in the first concealed layer 531 are connected from their outputs to inputs of neurons in the second concealed layer 532, in accordance with a structure which is governed by a second connection matrix A1. The neurons in the second concealed layer 532 are connected by their outputs to inputs of neurons in the third concealed layer 533 in accordance with a structure which is governed by the second connection matrix A1. The neurons in the third concealed layer 533 are connected by their outputs to inputs of neurons in the fourth concealed layer 534 in accordance with a structure which is governed by the second connection matrix A1. The neurons in the fourth concealed layer 534 are connected by their outputs to inputs of neurons in the fifth concealed layer 535 in accordance with a structure which is governed by the second connection matrix A1.
Respective "inner" states or "inner" system states st−4, st−3, st−2, st−1 and st of a dynamic process which is described by the TDRNN are represented at five successive times t−4, t−3, t−2, t−1 and t in the concealed layers, that is to say in the first concealed layer 531, the second concealed layer 532, the third concealed layer 533, the fourth concealed layer 534 and the fifth concealed layer 535.
The indices in the respective layers each indicate the time t−4, t−3, t−2, t−1 or t to which the signals (ut−4, ut−3, ut−2, ut−1, ut) which can in each case be tapped off from or supplied to the respective layer relate.
One output layer 520 has five partial output layers: a first partial output layer 541, a second partial output layer 542, a third partial output layer 543, a fourth partial output layer 544 and a fifth partial output layer 545. Neurons in the first partial output layer 541 are connected to neurons in the first concealed layer 531 in accordance with a structure which is governed by an output connection matrix C1. Neurons in the second partial output layer 542 are likewise connected to neurons in the second concealed layer 532 in accordance with the structure which is governed by the output connection matrix C1. Neurons in the third partial output layer 543 are connected to neurons in the third concealed layer 533, neurons in the fourth partial output layer 544 to neurons in the fourth concealed layer 534, and neurons in the fifth partial output layer 545 to neurons in the fifth concealed layer 535, in each case in accordance with the output connection matrix C1. The output variables for a respective time t−4, t−3, t−2, t−1, t can be tapped off (yt−4, yt−3, yt−2, yt−1, yt) on the neurons in the partial output layers 541, 542, 543, 544 and 545.
The principle that equivalent connection matrices in a neural network have the same values at a respective time is referred to as the principle of shared weights. The arrangement of computation elements which is known from Rumelhart et al. and is referred to as a Time Delay Recurrent Neural Network (TDRNN) is trained in a training phase in such a manner that a target variable ytd relating to an input variable ut is in each case determined on a real dynamic system. The tuple (input variable, determined target variable) is referred to as a training data item. A large number of such training data items form a training data set.
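The unfolded structure of FIG. 5 and the principle of shared weights can be sketched as follows. Every time step reuses the same matrices B1 (input to concealed layer), A1 (concealed layer to concealed layer) and C1 (concealed layer to output); the dimensions and the tanh nonlinearity are illustrative assumptions:

```python
import numpy as np

l, m, n = 2, 4, 1                # input, concealed and output dimensions (assumed)
rng = np.random.default_rng(2)
B1 = rng.normal(size=(m, l))     # shared input -> concealed-layer matrix
A1 = rng.normal(size=(m, m))     # shared concealed -> concealed matrix
C1 = rng.normal(size=(n, m))     # shared concealed -> output matrix

def tdrnn_unfolded(us):
    """Unfold over the inputs u_{t-4}, ..., u_t with shared weights."""
    s = np.zeros(m)
    ys = []
    for u in us:                 # one concealed layer per time step
        s = np.tanh(A1 @ s + B1 @ u)
        ys.append(C1 @ s)        # output tapped off at every time step
    return ys

us = [rng.normal(size=l) for _ in range(5)]   # u_{t-4}, ..., u_t
ys = tdrnn_unfolded(us)                       # y_{t-4}, ..., y_t
```

Because the same A1, B1 and C1 appear at every time step, the five concealed layers of FIG. 5 describe one and the same mapping applied repeatedly, which is precisely the shared-weights principle.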
In this case, tuples (ut−4, yt−4d), (ut−3, yt−3d), (ut−2, yt−2d) which follow one another in time at the times (t−4, t−3, t−2, . . . ) in the training data set each have a predefined time step.
The TDRNN is trained using the training data set, and S. Haykin provides a summary of the various training methods.
At this point, it should be stressed that only the output variables (yt−4, yt−3, . . . , yt) at the times (t−4, t−3, . . . , t) can be identified in the dynamic system 200. The "inner" system states (st−4, st−3, . . . , st) cannot be observed.
The following cost function E is normally minimized in the training phase:

E = (1/T) · Σ_{t=1}^{T} (yt − ytd)^2 → min_{f,g},  (4)
where T denotes a number of times being considered.
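Cost function (4) is simply the mean squared deviation between the network outputs yt and the target values ytd over the T times considered; a direct sketch (the numerical values below are placeholders):

```python
import numpy as np

y = np.array([0.5, 1.0, 1.5, 2.0])     # network outputs y_t, t = 1..T
y_d = np.array([0.0, 1.0, 2.0, 2.0])   # target values y_t^d from the training data
T = len(y)

E = (1.0 / T) * np.sum((y - y_d) ** 2)  # cost function, rule (4)
print(E)  # 0.125
```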
Furthermore, pages 3-90 of Neuronale Netze in der Ökonomie, Grundlagen und finanzwirtschaftliche Anwendungen (Neural Networks in Economics: Principles and Financial Applications) by H. Rehkugler and H. G. Zimmermann, published by Verlag Franz Vahlen, Munich, in 1994, contain a summary of the principles of neural networks and of the application options for neural networks in the field of economics.
The known systems and methods have the particular disadvantage that they can be used only to describe the current state of a process for an input variable ut at a current time t, or for an input variable ut−1 at a time t−1 which precedes the present time by a predetermined time step. A future subsequent state of the process which follows after a time step which can be predetermined, or a number of future subsequent states of the process which each follow one another after such a time step, cannot in most cases be described or predicted.
The invention is thus based on the problem of specifying an arrangement of computation elements which are connected to one another, by which future subsequent states which follow one another in time in a dynamic process can be described, and which arrangement is not subject to the disadvantages of the known systems.
Furthermore, the invention is based on the problem of specifying a method for computer-aided determination of a dynamic response, on which a dynamic process is based, by which future subsequent states in a dynamic process, which follow one another in time, can be described.
An arrangement of computation elements which are connected to one another according to the invention has the following features. The arrangement includes at least one first subsystem with a first input computation element, to which time series values, which each describe one state of a system in a first state space, can be supplied, and with a first intermediate computation element, by which a state of the system can be described in a second state space, with the first input computation element and the first intermediate computation element being connected to one another. The arrangement includes at least one second subsystem with an associated second intermediate computation element, by which a state of the system can be described in the second state space, and with an associated first output computation element, on which a first output signal, which describes a state of the dynamic system in the first state space, can be tapped off, with the second intermediate computation element and the first output computation element being connected to one another. 
The arrangement includes at least one third subsystem with an associated third intermediate computation element, by which a state of the system can be described in the second state space, and with an associated second output computation element, on which a second output signal, which describes a state of the dynamic system in the first state space, can be tapped off, with the third intermediate computation element and the second output computation element being connected to one another. The first subsystem, the second subsystem and the third subsystem are each connected to one another by a coupling between the associated intermediate computation elements. Weights which are each associated with one connection between two intermediate computation elements are equal to one another, and weights which are each associated with a connection between an intermediate computation element and an output computation element are equal to one another.
The following steps are carried out in a method for computer-aided determination of a dynamic response on which a dynamic process is based: a) the dynamic process is described by a time series with time series values in a first state space, with at least one first time series value describing a state of the dynamic process at a first time, and a second time series value describing a state of the dynamic process at a second time, b) the first time series value being transformed to a second state space, c) the first time series value in the second state space being subjected to mapping onto a second time series value in the second state space, d) the second time series value in the second state space being subjected to mapping onto a third time series value in the second state space, e) the second time series value in the second state space and the third time series value in the second state space each being transformed back to the first state space, and f) the dynamic response of the dynamic process being determined using the time series values in the second state space.
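Steps a) to f) above can be sketched as follows. The concrete transformations (linear maps for the change of state space, a tanh map for the forward dynamics in the second state space) are illustrative assumptions standing in for the trained computation elements:

```python
import numpy as np

rng = np.random.default_rng(3)
d1, d2 = 2, 4                    # dims of first and second state space (assumed)
enc = rng.normal(size=(d2, d1))  # transform: first -> second state space
dec = rng.normal(size=(d1, d2))  # transform back: second -> first state space
A = rng.normal(size=(d2, d2))    # mapping within the second state space

x1 = rng.normal(size=d1)         # a) first time series value, first state space
s1 = enc @ x1                    # b) transform to the second state space
s2 = np.tanh(A @ s1)             # c) map onto a second value in the second space
s3 = np.tanh(A @ s2)             # d) map onto a third value in the second space
x2_hat = dec @ s2                # e) transform both values back ...
x3_hat = dec @ s3                #    ... to the first state space
# f) the dynamic response is read off from the mapped values and the
#    back-transformed predictions x2_hat and x3_hat
```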
In a method for computer-aided determination of a dynamic response on which a dynamic process is based, which method is carried out using an arrangement of computation elements which are connected to one another, the arrangement has the following construction. If there is at least one first subsystem, an associated first input computation element, to which an input signal which describes a state of the dynamic system in a first state space can be supplied, and an associated first intermediate computation element, by which a state of the dynamic process can be described in a second state space, are connected to one another. If there is at least one second subsystem, an associated second intermediate computation element, by which a state of the dynamic process can be described in the second state space, and an associated first output computation element, on which a first output signal, which describes a state of the dynamic process in the first state space, can be tapped off, are connected to one another. If there is at least one third subsystem, an associated third intermediate computation element, by which a state of the dynamic process can be described in the second state space, and an associated second output computation element, on which a second output signal which describes a state of the dynamic process in the first state space at a time can be tapped off, are connected to one another. The first subsystem, the second subsystem and the third subsystem are each connected to one another by means of a coupling between the associated intermediate computation elements. Weights which are each associated with a connection between two intermediate computation elements are defined in such a manner that they are equal to one another, and weights which are each associated with a connection between an intermediate computation element and an output computation element are defined in such a manner that they are equal to one another. The input signal is supplied to the arrangement. The arrangement determines the first output signal and the second output signal. The dynamic response is determined using the first output signal and the second output signal.
In a method for computer-aided training of an arrangement of computation elements which are connected to one another, the arrangement has the following components. If there is at least one first subsystem, an associated first input computation element, to which an input signal which describes a state of a system in a first state space can be supplied, and an associated first intermediate computation element, by means of which a state of the system can be described in a second state space, are connected to one another. If there is at least one second subsystem, an associated second intermediate computation element, by means of which a state of the system can be described in the second state space, and an associated first output computation element, on which a first output signal, which describes a state of the system in the first state space, can be tapped off, are connected to one another. If there is at least one third subsystem, an associated third intermediate computation element, by means of which a state of the system can be described in the second state space, and an associated second output computation element, on which a second output signal which describes a state of the system in the first state space can be tapped off, are connected to one another. The first subsystem, the second subsystem and the third subsystem are each connected to one another by means of a coupling between the associated intermediate computation elements. Weights which are each associated with a connection between two intermediate computation elements are defined in such a manner that they are equal to one another, and weights which are each associated with a connection between an intermediate computation element and an output computation element are defined in such a manner that they are equal to one another. The arrangement is trained using predetermined training data, which are applied to the first input computation element as the input signal, in such a manner that error values are found only in those subsystems which represent states of the dynamic system whose times each correspond to a time of a training data item.
The arrangement is particularly suitable for carrying out the method according to the invention, or one of its developments explained in the following text.
A number of subsequent states of a dynamic process, which each follow one another by a time step which can be predetermined, can now be predicted using the invention. This allows states of the dynamic process to be predicted over long time periods. Such determination of future states of the dynamic process is referred to as overshooting.
In this way, the invention can be used for carrying out first cause analysis, for determining early warning indicators, and for the purposes of an early warning system. This means that, for a dynamic process, the invention can be used to determine those indicators or process states which already indicate, at the present time, undesirable process states which will follow the present time after a long time interval. A first cause of an undesirable development in the dynamic process can thus be identified at an early stage, and a remedial measure can thus be initiated.
The invention means, in particular, that states at a previous time are taken into account with less weighting in the determination of the dynamic response on which the dynamic process is based than states which have occurred at a more recent time.
Furthermore, the invention has the advantage that the training of the arrangement according to the invention and the training method according to the invention require less training data than known systems and methods, so that more efficient learning is possible. This is possible in particular owing to the particular selection and arrangement, or structure, of the computation elements used in the arrangement according to the invention.
A number of first, second and/or third subsystems are preferably used in each case.
One development includes at least one fourth subsystem having an associated fourth intermediate computation element, by means of which a state of the system can be described in the second state space, and having an associated second input computation element. The associated fourth intermediate computation element and the associated second input computation element, to which further time series values, which each describe a further state of the system in the first state space, can be supplied, are connected to one another. The fourth subsystem is coupled to the first subsystem by means of a coupling between the associated fourth intermediate computation element and the first intermediate computation element. In the development, weights which are each associated with a connection between an input computation element and an intermediate computation element are equal to one another.
A number of fourth subsystems are preferably used.
Simple output signals can be tapped off on an output computation element when one output computation element is connected to a number of intermediate computation elements.
In one preferred refinement, the first, the second and the third subsystems respectively represent the system at a first, a second and a third time, with the first, the second and the third time being successive times, with a first time interval between the first time and the second time which has a first time step which can be predetermined, and with a second time interval between the second time and the third time which has a second time step which can be predetermined.
For long-term prognosis, it is advantageous for the first subsystem to represent a current state of the system, for the second subsystem to represent a future first state of the system, shifted through the first time step which can be predetermined, and for the third subsystem to represent a future second state of the system, shifted through the second time step which can be predetermined.
The first time step which can be predetermined and the second time step which can be predetermined are preferably equal to one another.
In a development, the second time step which can be predetermined is a multiple of the first time step which can be predetermined.
In order to determine intermediate states of the system, it is advantageous for the first and/or the second time step to be defined in such a manner that the first and/or the second time step which can be predetermined is a divisor of a further predetermined time step, which is governed by a time series which is formed by the time series values.
In one refinement, the fourth subsystem represents the system at a fourth time, with there being a third time interval, which has a third time step which can be predetermined, between the fourth time and the first time. The fourth subsystem preferably represents the system at a previous time.
A dynamic response can be determined easily, particularly if at least some of the computation elements are artificial neurons.
Furthermore, from the computation point of view, it is particularly advantageous if, in one refinement, only one weight of weights which are in each case associated with a connection between an intermediate computation element and an output computation element has the value unity, and the other weights of the weights each have the value zero.
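The refinement whereby exactly one weight of the connections between an intermediate computation element and an output computation element has the value unity and the others have the value zero amounts to tapping one component of the intermediate state off directly, with no multiplication cost; a small sketch with assumed dimensions:

```python
import numpy as np

s = np.array([0.3, -0.7, 0.9])   # state of an intermediate computation element
C = np.array([[0.0, 1.0, 0.0]])  # one weight has the value unity, the rest zero

y = C @ s                        # the output simply taps off the second component
print(y)  # [-0.7]
```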
A development for determining a dynamic response of a dynamic process is preferably used.
One refinement has a measurement system for detecting physical signals, by means of which the dynamic process is described.
A development for determining the dynamic response of a dynamic process which takes place in a technical system, in particular in a chemical reactor, or for determining the dynamic response of an electrocardiogram (EKG), or for determining economic or macroeconomic dynamic responses, is preferably used.
One development can also be used for monitoring or controlling the dynamic process, in particular a chemical process, or one in which time series values can be determined from physical signals.