This invention relates to an index construction method for data that is frequently inserted and deleted, such as real-time stream data, and particularly for data whose index key values tend to increase or decrease with fluctuation, and for data whose key tendency switches.
There has been an increasing demand for a data processing system which carries out real-time processing for data continuously arriving at a database management system (hereinafter referred to as DBMS), which processes data stored in a storage system. For example, in a stock trading system, how quickly the system can react to changes in stock prices is one of the most important issues. A method such as the one used by a conventional DBMS, in which stock data is first stored in a storage system and the stored data is then searched, cannot keep up with the speed of the changes in stock prices, and may result in lost business opportunities. For example, although U.S. Pat. No. 5,495,600 discloses a mechanism which issues stored queries periodically, it is difficult to apply this mechanism to real-time data processing, for which it is important to execute a query immediately after data such as stock prices is input.
Data which continuously arrives is defined as stream data, and a stream data processing system has been proposed as a data processing system suited to real-time processing of the stream data. For example, R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma: “Query Processing, Resource Management, and Approximation in a Data Stream Management System”, In Proc. of the 2003 Conf. on Innovative Data Systems Research (CIDR), (online), January 2003, (retrieved on Oct. 12, 2006), Internet URL <http://infolab.usc.edu/csci599/Fall2002/paper/DS1_datastreammanagementsystem.pdf> discloses a stream data processing system “STREAM”.
In the stream data processing system, unlike in the conventional DBMS, queries are first registered with the system, and the registered queries are then executed continuously as data arrives. The above-mentioned STREAM employs a concept referred to as a sliding window, which cuts out a portion of the stream data in order to process the stream data efficiently. A preferred example of a query description language including a sliding window specification is the continuous query language (CQL) disclosed in the above-mentioned document by R. Motwani et al. The CQL includes an extension for specifying the sliding window by using parentheses following a stream name in a FROM clause of the structured query language (SQL), which is widely used for the DBMS. As for SQL, there is known one disclosed in C. J. Date, Hugh Darwen: “A Guide to SQL Standard (4th Edition)”, the United States, Addison-Wesley Professional, Nov. 8, 1996, ISBN: 0201964260. There are two typical methods for specifying the sliding window: (1) a method of specifying the number of data rows to be cut, and (2) a method of specifying a time period containing the data rows to be cut. For example, “Rows 50 Preceding” described in a second paragraph of the above-mentioned document by R. Motwani et al. is a preferred example of the item (1), in which data corresponding to 50 rows is cut to be processed, and “Range 15 Minutes Preceding” is a preferred example of the item (2), in which data for 15 minutes is cut to be processed. The stream data cut by the sliding window is retained in a memory, and is used for the query processing.
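The two window specifications can be illustrated with a minimal Python sketch. It is not taken from STREAM or CQL; the class names `RowWindow` and `TimeWindow` are hypothetical, and timestamps are assumed to be given in seconds and to arrive in non-decreasing order.

```python
from collections import deque


class RowWindow:
    """Row-based window: retains only the most recent `size` rows
    (in the spirit of "Rows 50 Preceding")."""

    def __init__(self, size):
        # A deque with maxlen discards the oldest row automatically.
        self.rows = deque(maxlen=size)

    def insert(self, row):
        self.rows.append(row)

    def contents(self):
        return list(self.rows)


class TimeWindow:
    """Time-based window: retains rows newer than `span` seconds before
    the latest arrival (in the spirit of "Range 15 Minutes Preceding").
    Assumes timestamps arrive in non-decreasing order."""

    def __init__(self, span):
        self.span = span
        self.rows = deque()

    def insert(self, timestamp, row):
        self.rows.append((timestamp, row))
        # Evict rows that have fallen out of the time range.
        while self.rows[0][0] <= timestamp - self.span:
            self.rows.popleft()

    def contents(self):
        return [row for _, row in self.rows]


rows = RowWindow(size=3)
for price in [101, 102, 103, 104, 105]:
    rows.insert(price)

ticks = TimeWindow(span=15 * 60)
ticks.insert(0, "a")
ticks.insert(600, "b")
ticks.insert(1000, "c")
```

In both cases every arrival causes an insertion and, once the window is full, a deletion of the oldest data, which is why data retained by a sliding window is inserted and deleted so frequently.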
To accelerate processing, the conventional DBMS constructs an index such as a B-tree index. As the B-tree index, one disclosed in R. Elmasri, S. B. Navathe: “Fundamentals of Database Systems”, 3rd edition, the United States, Addison-Wesley Professional, August 1999, ISBN: 0805317554 is known. If keys whose values monotonically increase are inserted into the B-tree index, a full node is split so that each resulting node holds half of the keys. Because no further keys are inserted into the node holding the smaller keys, there arises a problem in that half of the area for the index is left unused. To solve this problem, there has been proposed a method of efficiently constructing an index for monotonically increasing data by splitting a node unevenly at the key insertion position. A technique for splitting a node at a key insertion position is disclosed in U.S. Pat. No. 5,644,763.
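The effect of the split position on index area usage can be illustrated with a simplified Python sketch. It models only the leaf level of a B-tree as a list of sorted key lists, and it is not the method of U.S. Pat. No. 5,644,763; the names `build_leaves` and `fill_factor` and the node capacity of 4 keys are assumptions made for illustration.

```python
import bisect


def build_leaves(keys, capacity, split_at_insert):
    """Insert keys one by one into a chain of sorted leaf nodes.

    Only the leaf level is modeled; upper levels of the tree are omitted.
    split_at_insert=False halves an overflowing node;
    split_at_insert=True splits it at the key insertion position.
    """
    leaves = [[]]
    for key in keys:
        # Locate the leaf covering `key`: the last leaf whose first key <= key.
        firsts = [leaf[0] for leaf in leaves if leaf]
        idx = max(bisect.bisect_right(firsts, key) - 1, 0)
        leaf = leaves[idx]
        pos = bisect.bisect_left(leaf, key)
        leaf.insert(pos, key)
        if len(leaf) > capacity:
            # Choose the split point; max(pos, 1) keeps both halves non-empty.
            cut = max(pos, 1) if split_at_insert else len(leaf) // 2
            leaves[idx:idx + 1] = [leaf[:cut], leaf[cut:]]
    return leaves


def fill_factor(leaves, capacity):
    """Fraction of the allocated key slots that actually hold a key."""
    return sum(len(leaf) for leaf in leaves) / (len(leaves) * capacity)


keys = range(1, 101)  # monotonically increasing keys
even = build_leaves(keys, capacity=4, split_at_insert=False)
uneven = build_leaves(keys, capacity=4, split_at_insert=True)
print(fill_factor(even, 4), fill_factor(uneven, 4))
```

With monotonically increasing keys, the even split leaves every node except the last one about half full, whereas the split at the insertion position leaves each completed node full and starts a new node with the newly inserted key.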
The stream data processing system is expected to be applied in fields in which real-time processing is required, typified by financial applications, traffic information systems, traceability systems, sensor monitoring systems, and computer system management.