1. Field of the Invention
The present invention relates to a neural network, and more particularly, to a use of the neural network for an identification, a forecast, and a control of a non-linear flow on a physical system network.
2. Description of the Background Art
Conventionally, an identification, a forecast, and a control of a non-linear flow on a physical system network have been realized by a multi-layer perceptron in which the units forming the perceptron are arranged on multiple layers. In the following, this conventionally utilized multi-layer perceptron will be described briefly.
In the multi-layer perceptron, each unit comprises a part for receiving outputs of the other units, a part for determining an internal state of this unit according to the outputs received, and a part for outputting the result of applying a non-linear transformation to the determined internal state of this unit, as depicted conceptually in FIG. 1 for a case of the discrete-time continuous-output model, and in FIG. 2 for a case of the continuous-time continuous-output model.
In the discrete-time continuous-output model shown in FIG. 1, the unit i receives the outputs z.sub.1, z.sub.2, . . . , z.sub.N from the other units 1, 2, . . . , N, and the internal state u.sub.i of this unit i is defined as a weighted sum of these outputs, calculated by using the corresponding connection weight values w.sub.i 1, w.sub.i 2, . . . , w.sub.i N reflecting the connection state among the units, according to the following equation (1). ##EQU1##
The output z.sub.i of this unit i is then obtained by transforming this internal state u.sub.i by using a non-linear function f, according to the following equation (2).
z.sub.i = f(u.sub.i) (2)
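The processing of a single unit in the discrete-time continuous-output model can be sketched as follows. This is a minimal illustration only: the function names `unit_output` and `logistic` are hypothetical, and the logistic function is an assumed choice for the non-linear function f, which the description above does not specify.

```python
import math


def logistic(u):
    """Assumed example of the non-linear function f (not fixed by the text)."""
    return 1.0 / (1.0 + math.exp(-u))


def unit_output(weights, inputs, f=logistic):
    """Return (u_i, z_i) for one unit i.

    weights: connection weight values w_i1 ... w_iN
    inputs:  outputs z_1 ... z_N received from the other units
    """
    if len(weights) != len(inputs):
        raise ValueError("one connection weight is needed per received output")
    # Internal state: weighted sum of the received outputs.
    u = sum(w * z for w, z in zip(weights, inputs))
    # Output: non-linear transformation of the internal state.
    return u, f(u)


# Example with N = 3 received outputs.
u, z = unit_output([0.5, -1.0, 2.0], [1.0, 0.5, 0.25])
```

Any other bounded, differentiable function could be passed as `f` in place of the assumed logistic function.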
On the other hand, in the continuous-time continuous-output model shown in FIG. 2, the internal state u.sub.i is defined by the following differential equation (3). ##EQU2##
The output z.sub.i of this unit i in the continuous-time continuous-output model is obtained according to the above equation (2), just as in a case of the discrete-time continuous-output model of FIG. 1.
An exemplary conceptual configuration of a three-layer perceptron formed by such units is shown in FIG. 3, which comprises: an input layer formed by three units, a hidden layer formed by two units, and an output layer formed by two units. This three-layer perceptron can be utilized for establishing the correspondence between an input in a form of a three-dimensional vector and an output in a form of a two-dimensional vector. In such a perceptron, all the units carry out basically the same type of processing, and differ only in the settings of the connection weight values w.sub.i j. Therefore, it is necessary to determine the settings of the connection weight values w.sub.i j appropriately, so as to be able to obtain a desired output for a given input.
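The mapping realized by a three-layer perceptron of this 3-2-2 shape can be sketched as follows. The sketch assumes that the input layer merely passes the input vector on to the hidden layer, that f is the logistic function, and that no bias terms are used; the names `layer`, `forward`, `W_HIDDEN`, and `W_OUTPUT` and the weight values are hypothetical and chosen only for illustration.

```python
import math


def logistic(u):
    """Assumed non-linear function f."""
    return 1.0 / (1.0 + math.exp(-u))


def layer(weights, inputs):
    """Apply one layer: each row of weights belongs to one unit."""
    return [logistic(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(x, w_hidden, w_output):
    """Map a 3-dimensional input to a 2-dimensional output."""
    if len(x) != 3:
        raise ValueError("the input layer has three units")
    hidden = layer(w_hidden, x)      # two hidden units
    return layer(w_output, hidden)   # two output units


# Illustrative connection weight values (2x3 and 2x2).
W_HIDDEN = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
W_OUTPUT = [[0.5, -0.5], [0.25, 0.75]]

z = forward([1.0, 0.0, -1.0], W_HIDDEN, W_OUTPUT)
```

Changing only `W_HIDDEN` and `W_OUTPUT` changes the input-output correspondence, which is why the determination of these values is the central task.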
Conventionally, this determination of the appropriate settings of the connection weight values has been realized by using the error back-propagation algorithm, which is a learning algorithm for minimizing the error between the desired output for the given input and the actual output obtained by the neural network, by using the steepest gradient descent method. In the following, this error back-propagation algorithm will be described briefly. (See D. E. Rumelhart et al., "Learning representations by back-propagating errors", Nature Vol. 323, pp. 533-536, 1986, for further detail.)
In the error back-propagation algorithm, each unit is assumed to be in the discrete-time continuous-output model of FIG. 1 described above, and for a given input (vector) X, a desired output (vector) is denoted by Y(X), while the actual output (vector) obtained by the neural network is denoted by Z(X). Then, an objective function E to be minimized is defined as a sum of a squared error of the actual output Z(X) with respect to the desired output Y(X), according to the following equation (4). ##EQU3##
Then, the connection weight values w.sub.i j for minimizing this objective function E can be obtained as the convergence points w.sub.i j (.infin.) of the solution of the following differential equation (5). ##EQU4##
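The learning procedure can be illustrated by the following sketch, which is not a reproduction of equations (4) and (5). It assumes an objective function of the form E = (1/2) times the sum of squared output errors, a logistic function f without bias terms, and a forward-Euler discretization of the steepest descent flow with an assumed step size `eta`. All names and the training samples are hypothetical.

```python
import math
import random


def logistic(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, w_h, w_o):
    """Return hidden outputs h and network outputs z for input x."""
    h = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in w_h]
    z = [logistic(sum(w * hi for w, hi in zip(row, h))) for row in w_o]
    return h, z


def squared_error(samples, w_h, w_o):
    """Assumed objective: E = 1/2 * sum over samples and outputs of (y - z)^2."""
    total = 0.0
    for x, y in samples:
        _, z = forward(x, w_h, w_o)
        total += 0.5 * sum((yi - zi) ** 2 for yi, zi in zip(y, z))
    return total


def descent_step(samples, w_h, w_o, eta):
    """One Euler step of dw/dt = -dE/dw, gradients by back-propagation."""
    g_h = [[0.0] * len(row) for row in w_h]
    g_o = [[0.0] * len(row) for row in w_o]
    for x, y in samples:
        h, z = forward(x, w_h, w_o)
        # Error signal at the output units: dE/du = (z - y) * f'(u).
        d_o = [(zi - yi) * zi * (1.0 - zi) for zi, yi in zip(z, y)]
        # Error signal propagated back to the hidden units.
        d_h = [hj * (1.0 - hj) * sum(d_o[k] * w_o[k][j] for k in range(len(w_o)))
               for j, hj in enumerate(h)]
        for k, dk in enumerate(d_o):
            for j, hj in enumerate(h):
                g_o[k][j] += dk * hj
        for j, dj in enumerate(d_h):
            for i, xi in enumerate(x):
                g_h[j][i] += dj * xi
    new_h = [[w - eta * g for w, g in zip(wr, gr)] for wr, gr in zip(w_h, g_h)]
    new_o = [[w - eta * g for w, g in zip(wr, gr)] for wr, gr in zip(w_o, g_o)]
    return new_h, new_o


rng = random.Random(0)  # fixed seed so the run is repeatable
w_h = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_o = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
samples = [([1.0, 0.0, 0.0], [0.9, 0.1]), ([0.0, 1.0, 0.0], [0.1, 0.9])]

e_before = squared_error(samples, w_h, w_o)
for _ in range(200):
    w_h, w_o = descent_step(samples, w_h, w_o, eta=0.2)
e_after = squared_error(samples, w_h, w_o)
```

Because every step follows the negative gradient, the iteration settles at whichever minimum of E lies downhill from the initial weight values, which need not be the global minimum.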
Now, using the multi-layer perceptron as described above, the conventional method for an identification and a forecast of a non-linear flow on a physical system network constructed from N directed graphs as shown in FIG. 4 will be described.
In this physical system network shown in FIG. 4, the directed graphs are labelled by numbers i=1, 2, . . . , N (N=12 in FIG. 4), and a flow observed at a terminal point of a branch j at a time t will be denoted as Q.sub.j (t). Then, the flow to be observed at a terminal point of a branch j at a time t is determined from past records of the flows Q.sub.i (t-M.DELTA..tau.), . . . , Q.sub.i (t-.DELTA..tau.), where M is a natural number, .DELTA..tau. is a positive constant real number, and i=1, 2, . . . , N.
In the conventional method for an identification and a forecast of a non-linear flow on a physical system network, the correspondence between the flow at a given time and the past records of the flows is established by using the three-layer perceptron comprising an input layer formed by NM units, a hidden layer formed by an appropriate number of units, and an output layer formed by a single unit, as shown in FIG. 5, which shows a case of N=12 and M=2 (so that the number of units in the input layer is 12.times.2=24) in which the number of units in the hidden layer is four. Note here that, although not explicitly indicated in FIG. 5, each unit of the hidden layer is connected with all the units of the input layer, and the unit of the output layer is connected with all the units of the hidden layer.
Then, the connection weight values are determined according to the above differential equation (5), by regarding the past records of the flows Q.sub.i (t-M.DELTA..tau.), . . . , Q.sub.i (t-.DELTA..tau.) as the input X(t), and the flow Q.sub.j (t) as the desired output Y(X(t)) for this input X(t), and using the objective function to be minimized defined by the following equation (6). ##EQU5##
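The construction of the input X(t) and the desired output from the past records can be sketched as follows. The sketch assumes that the flows are stored as one list per branch, sampled at the interval .DELTA..tau., so that an integer index stands for one time step; the names `lagged_input` and `training_pairs` and the sample data are hypothetical.

```python
def lagged_input(flows, t, M):
    """Build X(t) from the M most recent past samples of every branch.

    flows[i][s] is the flow on branch i at time step s.
    The result has N*M entries: Q_i(t-M), ..., Q_i(t-1) for each branch i.
    """
    if t - M < 0 or t > len(flows[0]):
        raise ValueError("not enough past records for this t and M")
    x = []
    for series in flows:
        x.extend(series[t - M:t])
    return x


def training_pairs(flows, j, M):
    """Pair each input X(t) with the desired output Q_j(t)."""
    steps = len(flows[0])
    return [(lagged_input(flows, t, M), flows[j][t]) for t in range(M, steps)]


# Illustrative records: N = 12 branches, 5 time steps each.
flows = [[float(10 * i + s) for s in range(5)] for i in range(12)]
pairs = training_pairs(flows, j=0, M=2)
x0, y0 = pairs[0]   # x0 has 12 * 2 = 24 entries
```

With N=12 and M=2 each input has 24 entries, matching the 24 input units of FIG. 5.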
In this manner, the flow at a time (t+.DELTA..tau.) can be forecasted from the past records of the flows from a time (t-(M-1).DELTA..tau.) to a present time t.
Next, using the multi-layer perceptron as described above, the conventional method for a control of a non-linear flow on a physical system network will be described by regarding the physical system network shown in FIG. 4 as a traffic network, in which four nodes A, B, C, and D are considered as crossroads equipped with traffic signals. Here, in correspondence to the "red" signal and the "green" signal of each traffic signal, the signal control parameter s.sub.k (k=A, B, C, D; s.sub.k .epsilon. {0, 1}) is defined.
Then, the correspondence of the flow Q.sub.j (t) to be observed at a terminal point of a branch j at a time t with respect to the flow Q.sub.i (t-T) (i=1, 2, . . . , N) and the signal control parameter s.sub.k (t-T) (k=A, B, C, D) at a time (t-T) is established by using the three-layer perceptron such as that shown in FIG. 5. Here, for example, the input layer can be formed by 12+4 units in correspondence to the 12 branches and 4 nodes used in the traffic network of FIG. 4.
Then, the connection weight values are determined according to the above differential equation (5), by regarding the flow Q.sub.i (t-T) and the signal control parameter s.sub.k (t-T) as the input X(t), and the flow Q.sub.j (t) as the desired output Y(X(t)) for this input X(t), and using the objective function to be minimized defined by the above equation (6).
In this manner, the flow Q.sub.j (t+T) forecasted for a crossroad of interest at a time (t+T) can be obtained by calculating Z(X(t)) from the flow Q.sub.i (t) and the signal control parameter s.sub.k (t) as the input X(t) for the present time t.
Consequently, the optimum control at the present time for maximizing the flow at a time (t+T) can be realized by calculating Z(X(t)) for all the combinations of the control parameters, and then selecting the setting of the control parameters for which the flow at a time (t+T) is forecasted to be maximum.
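This exhaustive selection over the 2.sup.4 = 16 combinations of the four signal control parameters can be sketched as follows. The function `toy_forecast` is a hypothetical stand-in for the trained perceptron output Z(X(t)) and has no physical meaning; the names `NODES` and `best_signal_setting` are likewise illustrative.

```python
from itertools import product

NODES = ("A", "B", "C", "D")


def best_signal_setting(flows_now, forecast):
    """Try every combination of s_k in {0, 1} and keep the best forecast."""
    best_setting, best_flow = None, float("-inf")
    for bits in product((0, 1), repeat=len(NODES)):
        setting = dict(zip(NODES, bits))
        predicted = forecast(flows_now, setting)  # plays the role of Z(X(t))
        if predicted > best_flow:
            best_setting, best_flow = setting, predicted
    return best_setting, best_flow


def toy_forecast(flows_now, setting):
    """Hypothetical forecast used only to make the sketch runnable."""
    gain = (1.0 + 0.5 * setting["A"] - 0.25 * setting["B"]
            + 0.125 * setting["C"] - 0.0625 * setting["D"])
    return sum(flows_now) * gain


flows_now = [1.0] * 12   # illustrative present flows on the 12 branches
setting, flow = best_signal_setting(flows_now, toy_forecast)
```

The number of forecasts to evaluate doubles with every additional signal, so this enumeration is practical only for a small number of controlled nodes.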
Now, the conventional method for an identification, a forecast, and a control of a non-linear flow on a physical system network described above is associated with various problems as follows.
First, the multi-layer perceptron utilized in the conventional method is designed quite independently from the connection state (topology) of the physical system network to be dealt with. As a consequence, when the connection state in the physical system network is partially changed, it is necessary to re-determine all the connection weight values according to the above differential equation (5).
Also, in a case of dealing with a large scale physical system network, i.e., a case in which the input and output vectors have large dimensionality, the number of units required in the hidden layers also becomes large, so that the size of the multi-layer perceptron itself becomes large. However, in the learning of such a large scale perceptron using the steepest gradient descent method, the solution of the above differential equation (5) converges with very high probability to a local minimum of the objective function E, rather than to the desired global minimum.
In addition, in the conventional method for an identification, only the correspondence between the inputs and the outputs is established, so that it has been impossible to derive the system dynamics parameters specifying the dynamics of the physical system represented by the physical system network, such as the sink and the source at non-observed points on the physical system network, from the connection weight values determined by the learning. In other words, the conventional method has been addressing only the direct problem for the non-linear system, so that the inverse problem of the non-linear system cannot be handled by the conventional method.
Moreover, in the conventional method for a forecast, it has only been possible to make the forecast in units of prescribed unit time such as 15 minutes or 30 minutes, and it has been impossible to make the forecast for an arbitrary time. As a consequence, the conventional method for a control can also be carried out only in units of this prescribed unit time.