In general, text data having time data is called time series data. Examples include business report data recording daily business activity of a business person in a business area, nursing record data recording changes in the condition of a patient and nursing contents for the patient in the medical area, inspection records automatically recorded in a Web server, and Web log data recording contents of Web page in information processing area.
As apparatus for multi-dimensionally analyzing data stored in a database, a software called OLAP (On Line Analytical Processing) is known. Furthermore, an apparatus for realizing function of the OLAP tool is disclosed in Japanese Patent Disclosure (Kokai) P2002-183178 “Device and method for supporting data analysis and storage medium”. In this apparatus, data matched with an indicated condition can be set as a group through a visual programming environment, and data in the group can be analyzed. However, the data is only grouped in order to easily execute the count processing. In other words, the data can not be analyzed in time series.
Furthermore, GSP (Generalized Sequence Patterns) algorithm is disclosed in (Ramakrishnan Srikant and Rakesh Agrawal, “Mining Sequential Patterns: generalization and performance Improvements” in Proceeding of the 5th International Conference Extending database Technology, 3-17, 1996). In this algorithm, by previously indicating a criterion based on a frequency of a sequential pattern, characteristic sequential pattern can be found from a sequential data set. However, the sequential pattern is exhaustively found based on the indicated criterion. Accordingly, it takes a long time to find the sequential pattern. In addition to this, if a frequency of a characteristic pattern is slightly below the indicated criterion, the characteristic pattern can not be found. Accordingly, a possibility to miss the characteristic pattern exists.
Briefly, in the time series data analysis apparatus of prior art, data can not be analyzed in time series. Furthermore, in the GSP, it takes a long time to find the sequential pattern because the sequential pattern is exhaustively found based on the indicated criterion.