1. Field of the Invention
The present invention relates to a data analyzing device and method for detecting relationships among data widely used in industrial fields and extracting significant results that lead to industrially superior outcomes, and to a program for making a computer execute the data analyzing method.
2. Description of the Related Art
In the analysis of numerical data, the data distribution (particularly the magnitude of the values) in many cases exhibits a certain characteristic rather than a random distribution. Accordingly, if such a characteristic can be efficiently extracted from the data distribution, industrially superior information can be obtained. In practice, most collected data vary over time, and this temporal variation is particularly important in manufacturing process data. In data analysis, it is important to determine whether the temporal variation in the data has a random pattern or a characteristic pattern. If the temporal variation has a characteristic pattern, it is desirable to efficiently extract the information relating to that characteristic. In particular, in semiconductor manufacturing processes, work to efficiently detect the temporal variation and its causes from test results or various measurement results having continuous values, such as yield, and to establish countermeasures is carried out as routine business in order to produce superior results. In such a semiconductor manufacturing process, yield (which is numerical data), performance, and various variables relating thereto are examples of the data subjected to the data analysis.
Generally, the temporal variation of the various variables can be detected by drawing a trend graph in which the variable subjected to the data analysis is plotted on the vertical axis and time on the horizontal axis. In the trend graph, an area is identified in which the fluctuation pattern of the variable or the value of the variable is distinctively different from that of the other areas. For example, in a trend graph showing the yield of a semiconductor manufacturing process or the like, information such as the fluctuation pattern of the yield serves as a very important clue that leads to an improvement in the manufacturing process. Accordingly, an industrially superior result can be produced by efficiently extracting, from the temporal variation of a variable having continuous values, an area in which the fluctuation pattern or the values of the variable differ from those of the other areas (that is, an area having an extreme value), together with the characteristics of that area.
In addition, information about the degree of statistically significant difference between the values of a variable in one area and the values of the variable in the other areas is widely adopted and is particularly effective as information on the temporal variation. For example, in the semiconductor manufacturing process, if an area with low production yield is found, the information about the corresponding statistically significant difference can be used to extract an area in which a device was operating abnormally or an area in which a defective device was used. Accordingly, this information is important.
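As an illustration only, the degree of difference between one area and the other areas can be expressed as a single number. The following minimal sketch assumes Welch's t statistic as the measure; this choice, the function names `welch_t` and `area_vs_rest`, and the yield values are all hypothetical and are not taken from the invention or from the patent documents cited in this section.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(area, others):
    """Welch's t statistic between two samples (unequal variances allowed).

    Assumption: each sample has at least two points and the combined
    standard error is non-zero; otherwise this raises an error.
    """
    se = sqrt(variance(area) / len(area) + variance(others) / len(others))
    return (mean(area) - mean(others)) / se


def area_vs_rest(values, start, stop):
    """Compare the time window values[start:stop] against all other points."""
    area = values[start:stop]
    others = values[:start] + values[stop:]
    return welch_t(area, others)


# Hypothetical yield series in time order; points 4..7 form a low-yield area.
yields = [0.91, 0.92, 0.90, 0.93, 0.71, 0.69, 0.72, 0.70,
          0.92, 0.91, 0.90, 0.93]
t = area_vs_rest(yields, 4, 8)
print(f"Welch t for the window versus the rest: {t:.1f}")
```

A strongly negative value indicates that the window lies well below the other areas relative to the scatter of the data, which is the kind of quantitative information described above.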
[Patent Document 1] JP-A-2004-186374
[Patent Document 2] JP-A-2001-306999
The conventional technology has the following problems.
First, the data analysis using the trend graph requires a large number of variables to be examined. In order to extract more information, temporal variations of the same variable should be treated as different trend graphs if they were obtained under different devices or different conditions. Therefore, the number of trend graphs, which corresponds to the combinations of variables, devices, and conditions, may increase greatly. Accordingly, to extract the variables and the corresponding areas (i.e., time zones) in which the value of the variable is distinctively different from that of the other areas, the engineer who analyzes the data has to investigate a large number of trend graphs. Investigating the trend graphs one by one for each variable therefore requires a great deal of analysis work from the engineer.
Additionally, the data analysis using the trend graph does not employ a quantitative indicator. Accordingly, when the engineer investigates the respective trend graphs corresponding to a large number of variables, the engineer may have difficulty determining which variable to focus on and which variable to use for extracting an area in which the values of the variable are distinctively different from those of the other areas. Consequently, the precision of the data analysis may deteriorate.
Patent Document 1 discloses a manufacturing data analyzing method and a program for making a computer execute the same for efficiently extracting the information relating to the temporal variation of the variable having continuous values without using the trend graph. The manufacturing data analyzing method disclosed in Patent Document 1 provides an indicator (i.e., discriminative tracking feature: DTF) that indicates whether the temporal variation of the variable has a random pattern or a characteristic pattern. Particularly, if the temporal variation of the variable has a characteristic pattern, in many cases, it may be effective to perform the data analysis by focusing on the temporal variation of the variable.
Although the manufacturing data analyzing method disclosed in Patent Document 1 provides an indicator indicating whether the temporal variation is random or not, the method is not effective in judging whether there is an area having a statistically significant difference larger than that of the other areas, or in extracting that area. In the manufacturing data analyzing method disclosed in Patent Document 1, in order to extract the area having the statistically significant difference larger than that of the other areas, it is necessary to investigate the trend graph even if the DTF has a large value. Particularly, if the period to be investigated is relatively long, it is necessary to investigate the trend graph for discontinuous areas while scrolling the display screen. Accordingly, a large amount of analysis work is required and the precision of the analysis decreases.
Additionally, in data analysis using the trend graph, it is difficult to decide the position at which the distribution of numerical data on the trend graph is suitably partitioned into a large-valued area and a small-valued area. That is, it is difficult to judge which way of partitioning the areas maximizes the statistically significant difference between the two areas. Accordingly, an effective method based on a quantitative standard is desirable. In addition, it is desirable to extract the variables and the information on the corresponding areas that are distinctively different from the other variables and areas before investigating the trend graph for every variable. The invention is contrived to solve these problems.
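To make the partitioning problem concrete, the following minimal sketch scans every candidate split position of a time-ordered series and keeps the one with the largest absolute Welch t statistic between the two resulting areas. It is only one possible quantitative standard, assumed here for illustration; the function name `best_split`, the `min_size` parameter, and the sample series are hypothetical, and the sketch is not the method of the invention or of Patent Document 1.

```python
from math import sqrt
from statistics import mean, variance


def best_split(values, min_size=2):
    """Return (index, score) of the split values[:index] / values[index:]
    that maximizes the absolute Welch t statistic between the two areas.

    min_size is the smallest number of points allowed on either side
    (at least 2, so that a sample variance can be computed).
    Returns (None, 0.0) if no valid split exists.
    """
    best_index, best_score = None, 0.0
    for i in range(min_size, len(values) - min_size + 1):
        left, right = values[:i], values[i:]
        se = sqrt(variance(left) / len(left) + variance(right) / len(right))
        if se == 0.0:
            continue  # skip degenerate splits in which neither side varies
        score = abs(mean(left) - mean(right)) / se
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Hypothetical yield series in time order; the level drops after the fifth point.
series = [0.90, 0.92, 0.91, 0.93, 0.89, 0.72, 0.70, 0.73, 0.69, 0.71]
index, score = best_split(series)
print(f"best split before position {index}, |t| = {score:.1f}")
```

For this series the scan selects index 5, the boundary between the large-valued and small-valued areas. Applying the same scan to every variable would give a score by which variables could be ranked before any trend graph is inspected.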