1. Field of the Invention
The present invention relates to a recurrent neural network, and also to a system and method of estimating the trend of a change in a measurement that varies discontinuously in time.
2. Description of the Related Art
The Kalman filter has conventionally been used as an estimation filter. It is a classic method of identifying a system, and is still being developed for various applications. Separately, various methods are being newly established that apply a neural network, as a static nonlinear system identification method, to the estimation of nonlinear time-series data. However, the application of the Kalman filter is limited by the capabilities of current computers, and the applications of neural networks to time-series analysis are new and have the following problems.
(1) Method using Kalman Filter
When the trend of time-series data changes discontinuously, or when the noise representing the uncertainty of a model cannot be assumed to be Gaussian, a normal linear Gaussian Kalman filter does not properly estimate or filter data (refer to Time-series Analysis Programming by Genshiro Kitagawa, in Iwanami Computer Science published in 1993). If a discontinuous state change is processed by a linear Gaussian model, then a model of extremely large dimension is required. In that case, it is difficult to set a standard for objectively selecting the dimension of the model.
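The difficulty described above can be illustrated with a minimal sketch, assuming a scalar random-walk (local level) model and a noise-free step series. The function name `kalman_local_level`, the variances `q` and `r`, and the test series are illustrative assumptions and do not appear in the cited reference.

```python
def kalman_local_level(ys, q, r, x0=0.0, p0=1.0):
    """Scalar linear Gaussian Kalman filter (illustrative assumption).

    Model: x_t = x_{t-1} + w_t,  y_t = x_t + v_t,
    where w_t and v_t are Gaussian with variances q and r.
    """
    x, p = x0, p0
    estimates = []
    for y in ys:
        p = p + q                # prediction: uncertainty grows by q
        k = p / (p + r)          # Kalman gain
        x = x + k * (y - x)      # filtering: correct with the observation
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# A trend that jumps discontinuously from 0 to 10 at t = 50.
ys = [0.0] * 50 + [10.0] * 50
est = kalman_local_level(ys, q=0.01, r=1.0)

# Number of steps after the jump before the estimate exceeds 9.
lag = next(i for i, e in enumerate(est[50:]) if e > 9.0)
```

With a small system-noise variance `q`, the gain stays small and the estimate follows the jump only after a delay of many steps. Raising `q` shortens the delay but passes more observation noise through the filter, which is one way to see why a single linear Gaussian model handles a discontinuous trend poorly.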
Recently, to solve the above described problems, a nonlinear and non-Gaussian generalized Kalman filter has been introduced (Genshiro Kitagawa. Non-Gaussian State-Space Modeling of Nonstationary Time Series. Journal of the American Statistical Association, 82 (400): 1032-1041, 1987.). The generalized Kalman filter is successfully used in estimating and smoothing a discontinuous change in trend or noise. However, to operate the generalized Kalman filter, the distributions used in estimation, filtering, and smoothing must be computed directly. Therefore, the larger the state-space model, the longer it takes to identify a filtering coefficient appropriate for the generalized Kalman filter. By contrast, with the linear Gaussian model the probability distribution required in each step is determined by estimating only a mean and a variance, so the volume of computation for the identification is reduced, but the systems that can be identified are limited. Additionally, to apply the generalized Kalman filter effectively, prior knowledge is required of a distribution that can appropriately represent noise including abnormal values.
However, by using a Monte Carlo filter, which estimates the distribution of the noise from samples in the manner of the bootstrap method, data can be appropriately estimated or smoothed even if prior knowledge of the noise is lacking (Genshiro Kitagawa. A Monte Carlo Filtering and Smoothing Method for Non-Gaussian Nonlinear State Space Models. Research Memorandum 462, The Institute of Statistical Mathematics, 12 1993.). With the Monte Carlo filtering method, a methodology for general nonlinear non-Gaussian time-series data is being established. However, the time required to compute the probability distribution through the resampling process is significantly lengthened.
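A Monte Carlo filter of this kind can be sketched as follows. This is a minimal bootstrap-style illustration, not the algorithm of the cited memorandum: the random-walk state model, the Cauchy system noise, the Gaussian observation likelihood, and all names and parameter values (`bootstrap_filter`, `q_scale`, `r_std`, `n_particles`) are assumptions chosen for the example.

```python
import math
import random


def bootstrap_filter(ys, n_particles, q_scale, r_std, seed=0):
    """Bootstrap-style Monte Carlo filter for a scalar trend (sketch).

    Assumed model: x_t = x_{t-1} + w_t with heavy-tailed Cauchy noise w_t
    (scale q_scale), and y_t = x_t + v_t with Gaussian v_t (std r_std).
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in ys:
        # Prediction: move each particle with a Cauchy-distributed step.
        particles = [
            x + q_scale * math.tan(math.pi * (rng.random() - 0.5))
            for x in particles
        ]
        # Filtering: weight each particle by the observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in particles]
        if sum(weights) == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
        # Resampling: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        means.append(sum(particles) / n_particles)
    return means


# A trend that jumps discontinuously from 0 to 10 at t = 20.
ys = [0.0] * 20 + [10.0] * 30
means = bootstrap_filter(ys, n_particles=500, q_scale=0.1, r_std=1.0)
```

Every time step propagates, weights, and resamples all particles, so the cost grows with the number of particles times the length of the series. This is the computational burden of resampling noted above.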
(2) Method using a Neural Network
Data is extracted from time-series data through a time window. A series of patterns, each shifted slightly in time from the previous one, is thereby generated, and the time-series data are learned by a feed-forward neural network using back propagation (A. Waibel. Modular Construction of Time-Delay Networks for Speech Recognition. Neural Computation, 1:382-399, 1989./Jeng-Neng Hwang, Shyh-Rong Lay, Martin Maechler, R. Douglas Martin, and James Schimert. Regression Modeling in Back-Propagation and Projection Pursuit Learning. IEEE Transactions on Neural Networks, 5(3):342-353, May 1994.). The problem with this method is that, to learn the time-series data exactly, the scale of the neural network becomes large and the storage area runs short. This problem arises because the relationship between input and output data is represented by the weight values of the neural network. Furthermore, another problem arises: no definite description can be given from the viewpoint of the probability structure that generates the time-series data.
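The time-window step can be sketched as follows. The function name `make_windows`, the window width, and the sample series are illustrative assumptions. Each pattern pairs `width` consecutive samples (the network input) with the sample that follows them (the target to be learned).

```python
def make_windows(series, width):
    """Slide a window of `width` samples over `series` (illustrative).

    Returns the input patterns and, for each pattern, the next sample
    as the target value.
    """
    inputs, targets = [], []
    for start in range(len(series) - width):
        inputs.append(series[start:start + width])
        targets.append(series[start + width])
    return inputs, targets


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
inputs, targets = make_windows(series, width=3)
# inputs  -> [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
# targets -> [4.0, 5.0, 6.0]
```

Because the input layer needs one unit per sample in the window, a longer history requires a wider window and therefore more weights, which is the scale problem described above.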
To solve the problem of network scale, a recurrent neural network having a feedback structure has been designed (Jerome T. Connor, R. Douglas Martin, and L. E. Atlas. Recurrent Neural Networks and Robust Time Series Prediction, IEEE Transactions on Neural Networks, 5(2):240-254, March 1994.). There are two main types of recurrent neural networks: one in which the output layer recurs (Jordan method), and one in which the intermediate layer recurs (Elman method). The recurrent neural network is specifically provided with a layer for storing the recurrent information. This layer is referred to as a context layer.
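The difference between the two types can be sketched in a single time step. This is a minimal illustration only: the function names, the tanh activation, the linear output layer, the layer sizes, and the weight values are all assumptions, and no learning is shown.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(x, context, w_in, w_ctx, w_out, mode):
    """One time step of a simple recurrent network (illustrative).

    mode="elman":  the context layer stores the intermediate layer.
    mode="jordan": the context layer stores the output layer.
    """
    if mode not in ("elman", "jordan"):
        raise ValueError("mode must be 'elman' or 'jordan'")
    pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_ctx, context))]
    hidden = [math.tanh(p) for p in pre]
    output = matvec(w_out, hidden)
    new_context = hidden if mode == "elman" else output
    return output, new_context


# Assumed sizes: 1 input unit, 2 intermediate units, 1 output unit.
w_in = [[0.5], [-0.3]]
w_out = [[1.0, -1.0]]
w_ctx_elman = [[0.1, 0.2], [0.0, 0.3]]   # 2 x 2: context has 2 units
w_ctx_jordan = [[0.1], [0.2]]            # 2 x 1: context has 1 unit

# Feed an impulse followed by a zero input to each network.
out1_e, ctx_e = rnn_step([1.0], [0.0, 0.0], w_in, w_ctx_elman, w_out, "elman")
out2_e, ctx_e = rnn_step([0.0], ctx_e, w_in, w_ctx_elman, w_out, "elman")

out1_j, ctx_j = rnn_step([1.0], [0.0], w_in, w_ctx_jordan, w_out, "jordan")
out2_j, ctx_j = rnn_step([0.0], ctx_j, w_in, w_ctx_jordan, w_out, "jordan")
```

In both cases the second output is nonzero even though the second input is zero, because the context layer carries information from the previous step. The size of the context layer follows the layer that recurs: the intermediate layer in the Elman case, the output layer in the Jordan case.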
The feedback structure solves the scale problem, but it is not certain at what scale and density of the context layer the past information history should recur in order to obtain an appropriate estimation filter. Furthermore, the operation and estimation for unknown time-series data (data generated by the same probability structure as the data used in identifying the parameters) are not clearly defined. In addition, a general network connection clearly requires a high spatial computation cost, and a large volume of information is required to compute differential coefficients and the like when each type of coefficient is obtained.
There is a method of configuring an estimation filter using a recurrent neural network whose structure is restricted in a manner similar to an autoregressive moving average (ARMA) model (James Ting-Ho Lo. Synthetic Approach to Optimal Filtering. IEEE Transactions on Neural Networks, 5(5):803-811, September 1994./G. V. Puskorius and L. A. Feldkamp. Recurrent Neural Networks with the Decoupled Extended Kalman Filter Algorithm. Science of Artificial Neural Networks, 1710:461-473, 1992.). In this case, the internal state of the neural network can be interpreted through a normal Kalman filter. There is also a method of selecting a parameter appropriate for given data while computing the error of a given parameter according to the least-squares-error criterion and through computation using the Kalman filter. However, this technology has the problem of the large volume of computation required by the Kalman filter. Additionally, many points remain unclear regarding the relationship between the internal state of the network and the time-series data, which makes the internal state difficult to interpret.