1. Field of the Invention
This invention relates to a system for predicting and analyzing economic phenomena, such as variations in stock prices, bond prices and exchange rates, using a neural network.
2. Description of the Related Art
FIG. 2 of the accompanying drawings shows an economic phenomenon predicting and analyzing system, which is disclosed in, for example, "Stock Market Prediction System with Modular Neural Networks", by T. Kimoto and K. Asakawa, in Proceedings of International Joint Conference on Neural Networks, June 1990. This system predicts variations in TOPIX and analyzes the causes of the variations. TOPIX (Tokyo Stock Exchange Prices Index) is a stock index used in Japan that covers stocks traded on the Japanese market.
This system comprises two subsystems: a prediction system 1 and an analysis system 2. The prediction system 1 is composed of a preparation module 3, a number of neural networks 4 connected to the back stage of the preparation module 3, and an unification module 5 for obtaining a weighted average of the outputs of the neural networks 4.
The preparation module 3 inputs time series data 6 indicating the time variation of TOPIX in a predetermined past period. The preparation module 3 further inputs various time series data 7-1-7-n indicating past time variations of, for example, turnover, interest rate, foreign exchange rate and the New York Dow-Jones average. Moreover, the preparation module 3 performs a logarithmic arithmetic operation, a normalization arithmetic operation and an error function arithmetic operation on the input time series data and then supplies the results to the individual neural networks 4.
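The preprocessing described above can be illustrated with a minimal Python sketch. The source does not state how the logarithmic, normalization and error function operations are combined, so the ordering used here (log ratio, then zero-mean/unit-variance scaling, then squashing with the error function) is an assumption, as are the function names and the sample figures.

```python
import math

def log_changes(series):
    """Natural log of the ratio between consecutive observations."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]

def normalize(values):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against a constant series
    return [(v - mean) / std for v in values]

def squash(values):
    """Map values into the open interval (-1, 1) with the error function."""
    return [math.erf(v) for v in values]

def prepare(series):
    # Assumed ordering: logarithm, then normalization, then error function.
    return squash(normalize(log_changes(series)))

weekly_index = [100.0, 102.0, 101.0, 105.0, 104.0]  # made-up figures
prepared = prepare(weekly_index)
```

Each prepared value lies strictly between -1 and 1, which is a convenient range for the inputs of a neural network.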
Each neural network 4 has a hierarchical structure including an input layer 8, a hidden layer 9 and an output layer. In FIG. 2, each of the output layers is composed of a single output neuron 10.
No neural network 4 can be used for prediction until it has undergone learning. The learning of the neural network 4 is carried out according to the so-called back propagation method. During this process, learning data comprising two kinds of data, i.e., input data and teaching data, have to be given to the neural network 4.
Data indicating economic phenomena that have actually occurred in the past are used as the input data, and data indicating economic phenomena that have actually occurred following the past economic phenomena are used as the teaching data. More specifically, the input data are time series data indicating variations of TOPIX in a period and data indicating variations such as of turnover, interest rate, foreign exchange rate and New York Dow-Jones average in the same period, and the teaching data are data indicating actual variations of TOPIX in a period following the previous period.
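The pairing of input data with teaching data can be sketched as follows. This is only an illustration: the function name `make_learning_data`, the `window` and `horizon` parameters and the sample series are hypothetical and do not come from the cited publication.

```python
def make_learning_data(target, indicators, window, horizon):
    """Pair each past window with the target values that followed it.

    target:     list of floats (e.g. weekly index changes)
    indicators: list of lists, each the same length as target
    window:     number of past periods used as input data
    horizon:    number of following periods used as teaching data
    """
    samples = []
    last_start = len(target) - window - horizon
    for start in range(last_start + 1):
        end = start + window
        inputs = list(target[start:end])
        for series in indicators:
            inputs.extend(series[start:end])  # same period as the target window
        teaching = list(target[end:end + horizon])  # the period that follows
        samples.append((inputs, teaching))
    return samples

target = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01]   # made-up index changes
turnover = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8]         # made-up indicator
pairs = make_learning_data(target, [turnover], window=3, horizon=1)
```

With six observations, a window of three and a horizon of one, three learning samples are produced, each holding the index and indicator values of one period together with the index value of the period that followed.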
In the system of FIG. 2, since the preparation module 3 is located at the stage preceding the neural networks 4, the time series data indicating variations of TOPIX among the input data are given to the preparation module 3 as input data 6 and are then input to the neural networks 4. Likewise, the time series data indicating variations of, for example, turnover, interest rate, foreign exchange rate and the New York Dow-Jones average are given to the preparation module 3 as input data 7-1-7-n and are then input to the neural networks 4.
For learning, all prepared learning data are repeatedly input to the neural networks 4. As all learning data are thus repeatedly given according to the back propagation method, each neural network 4 becomes self-organized. Upon termination of this learning, each neural network 4 has, in effect, acquired the past economic phenomena by experience. Therefore, after termination of the learning process, the economic time series data 6, 7-1-7-n are given successively to the preparation module 3 so that the result of prediction of a future economic phenomenon, based on the experience of the past economic phenomena, can be obtained from the neural networks 4 via the unification module 5.
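The repeated presentation of learning data under back propagation can be sketched for a network with one hidden layer and a single output neuron. The sketch rests on several assumptions not stated in the source: tanh hidden units, a linear output neuron, a squared-error loss, a fixed learning rate and a toy data set. The class and function names are hypothetical.

```python
import math
import random

class TinyNetwork:
    """Input layer -> one hidden layer (tanh) -> single linear output neuron."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row ends with a bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def hidden_outputs(self, x):
        return [math.tanh(sum(w * v for w, v in zip(ws, x)) + ws[-1])
                for ws in self.w_hidden]

    def predict(self, x):
        h = self.hidden_outputs(x)
        return sum(w * v for w, v in zip(self.w_out, h)) + self.w_out[-1]

    def train_step(self, x, target, rate):
        h = self.hidden_outputs(x)
        y = sum(w * v for w, v in zip(self.w_out, h)) + self.w_out[-1]
        err = y - target  # derivative of 0.5 * (y - target)**2 w.r.t. y
        # Propagate the error back to the hidden layer before updating w_out.
        deltas = [err * self.w_out[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for j, hj in enumerate(h):
            self.w_out[j] -= rate * err * hj
        self.w_out[-1] -= rate * err
        for j, delta in enumerate(deltas):
            for i, xi in enumerate(x):
                self.w_hidden[j][i] -= rate * delta * xi
            self.w_hidden[j][-1] -= rate * delta
        return 0.5 * err ** 2

def train(net, samples, epochs, rate=0.1):
    """Present all learning data repeatedly; return the mean loss per epoch."""
    history = []
    for _ in range(epochs):
        total = sum(net.train_step(x, t, rate) for x, t in samples)
        history.append(total / len(samples))
    return history

# Toy learning data: (input data, teaching data) pairs, made up for illustration.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.5),
           ([1.0, 0.0], 0.5), ([1.0, 1.0], 1.0)]
net = TinyNetwork(n_inputs=2, n_hidden=5)
losses = train(net, samples, epochs=500)
```

After training, `net.predict(x)` plays the role of the output neuron 10, and `net.hidden_outputs(x)` gives the hidden layer vector that an analysis stage could later examine.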
The unification module 5 calculates the weighted average of the outputs of the output neurons 10 of the individual neural networks 4. Specifically, the unification module 5 performs the following arithmetic operation:
Firstly, the rate of increase of TOPIX at a time t is represented by TOPIX(t)/TOPIX(t-1), where TOPIX(t) is a stock index at a week t. The unification module 5 obtains a logarithm, usually a natural logarithm, of this value. In other words, $\gamma_t$ is obtained by the equation (1):
$$\gamma_t = \ln\left(\frac{\mathrm{TOPIX}(t)}{\mathrm{TOPIX}(t-1)}\right) \qquad (1)$$
Secondly, return $\gamma_N(t)$ at a time t is obtained using the following equation (2):
$$\gamma_N(t) = \sum_{i=1}^{N} \phi^{i}\,\gamma_{t+i} \qquad (2)$$
where $\phi^{i}$ is the weight of the natural logarithm $\gamma_{t+i}$ of the rate of increase of TOPIX at a time t+i. $\phi^{i}$ is determined within a range of 0.5 to 1, and so as to decrease as i becomes larger. Since a large i means a distant future, the equation (2) is a weighted average operating equation which gives a smaller weight to the more distant future when evaluating the return $\gamma_N(t)$. The unification module 5 outputs the return $\gamma_N(t)$ obtained as the result of this weighting arithmetic operation.
Accordingly, the output 11 of the unification module 5 will be an index indicating the weighted average rate of increase of TOPIX in a predetermined period after the present time. This value will be positive if the stock price increases in the future and negative if the stock price decreases in the future. From the unification module 5, the return $\gamma_N(t)$ can be obtained as the output 11, which is significant data for estimating economic trends from the present time onward (after a time t). The data 6, 7-1-7-n should preferably be given to the prediction system 1 at intervals of one week.
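Equations (1) and (2) can be worked through in a short Python sketch. It assumes that the weight is a single constant phi raised to the power i, which is one reading of the superscript in equation (2); the function names and the index values are made up for illustration.

```python
import math

def log_return(index, t):
    """Equation (1): natural log of index[t] / index[t - 1]."""
    return math.log(index[t] / index[t - 1])

def weighted_return(index, t, n, phi):
    """Equation (2): sum over i = 1..n of phi**i * gamma_{t+i}.

    Assumes the weight is phi raised to the power i, so that periods
    further in the future contribute less when 0.5 <= phi < 1.
    """
    total = 0.0
    for i in range(1, n + 1):
        total += (phi ** i) * log_return(index, t + i)
    return total

rising = [100.0, 101.0, 103.0, 106.0]   # made-up weekly index values
falling = [100.0, 99.0, 97.0, 94.0]
up = weighted_return(rising, t=0, n=3, phi=0.9)
down = weighted_return(falling, t=0, n=3, phi=0.9)
```

Here `up` is positive and `down` is negative, matching the sign behaviour described for the output 11. With phi equal to 1 the sum telescopes to the plain log change of the index over the N periods.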
Further, the above-identified publication discloses a method of analyzing the causal relationship between the time series input data 6, 7-1-7-n and the output value 11 using the individual neural networks 4 provided with learning. In this method, the number of neurons of each hidden layer 9 is predetermined to be small (e.g., five). For analyzing the causal relationship between the time series input data 6, 7-1-7-n and the output of the individual neural networks 4, the time series input data 6, 7-1-7-n for learning are input to the analysis system 2 as input data 12, and the corresponding outputs of the individual hidden layers 9 are input to the analysis system 2 as hidden layer outputs 13. A multivariate analysis is then performed with the input data 12 as independent variables and the outputs 13 as dependent variables. Specifically, a cluster analysis is made over vectors of the outputs 13 of the individual hidden layers 9. In the learning process, in which various learning data 12 are given, similar outputs 13 can be obtained from the hidden layers 9 for different learning data 12. A set of such learning data is called a cluster. The analysis system 2 is a system for obtaining the causal relation between the time series input data 6, 7-1-7-n and the output value 11 by sorting the learning data into clusters.
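The grouping of learning data by similar hidden layer outputs can be sketched as below. The source says only that a cluster analysis is made; plain k-means with a naive initialisation is substituted here purely as an illustration, and the vectors and names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def k_means(vectors, k, iterations=20):
    """Return one cluster label per vector (plain k-means, assumed method)."""
    centers = [list(v) for v in vectors[:k]]  # naive initialisation
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(v, centers[c]))
                  for v in vectors]
        # Move every center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = mean_vector(members)
    return labels

# One hidden layer output vector per learning sample (made-up values).
hidden_outputs = [
    [0.9, 0.1, 0.8], [0.1, 0.9, 0.2], [0.8, 0.2, 0.9],
    [0.2, 0.8, 0.1], [0.85, 0.15, 0.85],
]
labels = k_means(hidden_outputs, k=2)

# Learning samples sharing a label form one cluster.
clusters = {}
for sample_index, label in enumerate(labels):
    clusters.setdefault(label, []).append(sample_index)
```

Samples 0, 2 and 4 produce similar hidden layer outputs and fall into one cluster, while samples 1 and 3 fall into the other.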
Generally, the greater the number of neurons constituting each of the hidden layers 9, the higher the prediction accuracy of the prediction system 1. On the other hand, when the number of neurons constituting each of the hidden layers 9 is set equal to the number of clusters, redundancy is eliminated, and a cluster analysis using the outputs 13 of the hidden layers 9 as shown in FIG. 2 therefore becomes easy. The number of clusters is usually only several, and hence the number of neurons of each of the hidden layers 9 is set small so as to match the number of clusters. Thus the redundant neurons are eliminated.
This analysis system 2 is convenient for providing a detailed analysis of economic phenomena. Specifically, by examining the learning data 12 sorted into individual clusters by the analysis system 2, a more detailed analysis of economic phenomena can be achieved. For example, by locating the dates of the learning data 12 belonging to an individual cluster on the time series data of the stock index, it is possible to determine the kind of market (i.e., a bull market, a stagnant market or a bear market) corresponding to the cluster. Further, by examining the frequency distribution of the input data (learning data 12) belonging to an individual cluster, it is possible to investigate the main factors behind the market trend corresponding to that cluster. Namely, if the input data (learning data 12) sorted into a cluster occupy only a deviated part of the overall distribution of the input data (learning data 12), the values of the input data near that deviated part can be considered to be among the factors causing the market trend corresponding to the cluster.
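The examination of frequency distributions per cluster can be sketched with a simple histogram. This is a hedged illustration only: the equal-width binning, the function names and the interest rate figures are assumptions and are not taken from the cited publication.

```python
from collections import Counter

def bin_index(value, low, high, bins):
    """Index of the equal-width bin containing value (clamped to the range)."""
    if high == low:
        return 0
    position = int((value - low) / (high - low) * bins)
    return min(max(position, 0), bins - 1)

def frequency_by_cluster(values, labels, bins=3):
    """Histogram of one input variable, computed separately for each cluster."""
    low, high = min(values), max(values)
    table = {}
    for value, label in zip(values, labels):
        counter = table.setdefault(label, Counter())
        counter[bin_index(value, low, high, bins)] += 1
    return table

interest_rate = [2.0, 2.1, 5.9, 6.0, 2.2, 5.8]   # made-up input data
labels = [0, 0, 1, 1, 0, 1]                      # cluster of each sample
table = frequency_by_cluster(interest_rate, labels)
```

In this made-up example all samples of cluster 0 fall into the lowest bin and all samples of cluster 1 into the highest bin, so the interest rate level would be a candidate factor distinguishing the two clusters.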
In the prior system, it is preferable that the number of neurons constituting each of the hidden layers be equal to the number of clusters into which the learning data are to be sorted. However, it is not possible to anticipate how many clusters the input data should actually be sorted into. Therefore, the number of neurons has to be determined experimentally by trial and error.
Since the number of clusters is only a few, the number of hidden layer neurons was restricted in order to satisfy the demand of the analysis system. However, to improve the prediction performance, the number of hidden layer neurons is preferably large. In the prior system, therefore, the analysis system's demand for removing redundant neurons has been the bottleneck in improving the prediction performance.
Further, with the prior system, analyses such as a technical analysis could not be performed. Technical analysis, like fundamental analysis, is one of the classic economic analyses. It is a method of checking past variation patterns of stock prices using, for example, a chart, and of estimating future variations of stock prices from those past variations. This method adopts conventional statistical methods and makes it possible to obtain the causal relation between the variations of the input data and the output. However, with the prior system using neural networks, since the technical analysis could not be performed, neither the causal relationship between the past variation pattern and the future variation nor the causal relation between the variations of input data and the output could be obtained.