The present invention relates to a time series database processing system, particularly one of ultra-large scale, that stores data pieces serving as detailed update information in time series order in a database and controls the addition, deletion, and retrieval of data.
When data pieces are loaded into a large-scale database and a specified data piece is retrieved from it, an index is generally used. Indexing is effective when an item serving as a key during retrieval can be specified. An index is a mechanism in which specified key items of a database are collected and pointers are arranged over them in the form of a balanced tree (B-tree), so that the tree can be traversed at high speed down to the leaf corresponding to a given key, guided by information indicating which range the key value lies in. “An Introduction to Database Systems, 3.4 Indexing” by C. J. Date, Addison-Wesley, 1986, pp. 68-77 teaches a mechanism by which information corresponding to the storage locations of all data items can be obtained. If the database holds about a million cases or events, no problem occurs. In a database of ultra-large scale holding a billion or a trillion cases, however, the maintenance of the index itself swells, and in particular keys that are added in time series fashion may not be handled well.
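The key-to-location lookup described above can be illustrated with a minimal sketch. It is not a B-tree; it assumes a sorted key list searched by binary search as a stand-in for tracing the tree to a leaf. The names SortedIndex, lookup, and full_scan are hypothetical and do not come from the cited reference.

```python
import bisect

class SortedIndex:
    """Toy stand-in for a B-tree index: sorted keys with parallel row locations."""

    def __init__(self):
        self.keys = []
        self.locations = []

    def insert(self, key, location):
        # Keep the key list sorted; equal keys are placed after existing ones.
        pos = bisect.bisect_right(self.keys, key)
        self.keys.insert(pos, key)
        self.locations.insert(pos, location)

    def lookup(self, key):
        # Binary search narrows the range in which the key lies.
        lo = bisect.bisect_left(self.keys, key)
        hi = bisect.bisect_right(self.keys, key)
        return self.locations[lo:hi]

def full_scan(rows, key):
    # Without an index, every row must be examined.
    return [loc for loc, (k, _) in enumerate(rows) if k == key]

# Hypothetical rows of (key, payload); the list position is the storage location.
rows = [(17, "a"), (3, "b"), (42, "c"), (3, "d")]
index = SortedIndex()
for loc, (k, _) in enumerate(rows):
    index.insert(k, loc)

assert index.lookup(3) == full_scan(rows, 3)
```

Both paths return the same locations; the difference is that the indexed lookup inspects a logarithmic number of keys, while the scan inspects every row.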
When data pieces are added in time series fashion, the index grows in the direction of increasing time, as shown in FIG. 1. Further, with respect to deletion, it is known that as index entries whose retention period has expired are deleted, data pieces remain on only one side of the index tree, while on the other side the nodes still exist but their item values have been lost, leaving the index in a very inefficient condition. In such an event, the index must be rebuilt by a technique called reorganization, which removes the wasted areas and restores efficiency. In a time series database of ultra-large scale, however, this is not practical because the work required far exceeds the permissible range.
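The one-sided growth and hollowing described above can be sketched as follows. This is a simplified model, not the structure of FIG. 1: it tracks only a chain of leaf nodes with an assumed capacity of four entries, and it assumes, as the text states, that emptied nodes stay allocated until reorganization. LeafChain and LEAF_CAPACITY are hypothetical names.

```python
LEAF_CAPACITY = 4  # assumed number of entries per leaf node

class LeafChain:
    """Leaf level of an index whose keys arrive in increasing time order."""

    def __init__(self):
        self.leaves = [[]]

    def append(self, ts):
        # Time-ordered keys always land in the rightmost leaf.
        if len(self.leaves[-1]) == LEAF_CAPACITY:
            self.leaves.append([])
        self.leaves[-1].append(ts)

    def delete_older_than(self, cutoff):
        # Entries are removed, but the emptied leaves remain allocated.
        for leaf in self.leaves:
            leaf[:] = [ts for ts in leaf if ts >= cutoff]

    def occupancy(self):
        used = sum(len(leaf) for leaf in self.leaves)
        return used / (len(self.leaves) * LEAF_CAPACITY)

    def reorganize(self):
        # Rebuild the leaves from the surviving entries only.
        live = [ts for leaf in self.leaves for ts in leaf]
        self.leaves = [live[i:i + LEAF_CAPACITY]
                       for i in range(0, len(live), LEAF_CAPACITY)] or [[]]

chain = LeafChain()
for ts in range(16):
    chain.append(ts)          # fills four leaves from left to right
chain.delete_older_than(12)   # expires the three oldest leaves' entries
before = chain.occupancy()
chain.reorganize()
after = chain.occupancy()
```

In this model the reorganization step touches every surviving entry, which is why its cost grows with the size of the database.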
A data loading utility writes data directly to a physical area of a database and can therefore write data at high speed. During loading, however, such a high-speed utility generally prevents other retrieval or updating accesses from conflicting with the physical area being written. In other words, data loading must be executed while access for retrieval or updating to the specified table, or to the relevant part of the table, is inhibited. This forces retrieval from the database to be stopped each time time series data is loaded, which can be daily. In a database of ultra-large scale, a retrieval itself takes one day or more in some applications. In that case, data loading cannot be permitted unless retrieval is stopped, which is a fatal inconvenience. To avoid this situation, data can be added through ordinary data insertion operations instead of data loading, but performance is then degraded by approximately one order of magnitude compared with data loading of the physical writing type. Moreover, locks must be acquired to hide data while it is being added, which greatly affects the performance of operations that retrieve all cases or events in the database.
To delete a data piece whose retention period has expired, the data piece typically must first be retrieved. Even when an index is available, the time consumed is significant, comparable to that for inserting data piece by piece. In the absence of an index, all data pieces must be retrieved in order to delete the data piece of interest. Consequently, in a database of ultra-large scale, the deletion processing alone takes one day or more, and the time series database cannot be realized in practice.
Thus, when data whose retention period has expired is deleted in the absence of an index, more time is consumed than for retrieving all pieces of data. Conversely, in the presence of an index, the index is updated during deletion, which consumes much time, as in the case of data insertion. Accordingly, it is practically difficult to realize daily data deletion in a database that takes one day or more to retrieve all data pieces.
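The two deletion paths contrasted above can be compared with a minimal sketch that counts the work each one performs. It assumes rows of the hypothetical form (timestamp, payload) and a sorted key list in place of a real index; the function names and the cutoff value are illustrative only.

```python
import bisect

def delete_expired_by_scan(rows, cutoff):
    """No index: every row is examined. Returns (remaining rows, rows examined)."""
    remaining = [row for row in rows if row[0] >= cutoff]
    return remaining, len(rows)

def delete_expired_by_index(rows_by_loc, index_keys, index_locs, cutoff):
    """Sorted index on time: expired entries are found by binary search,
    but each deleted row also costs one index entry removal.
    Returns the number of index entries updated."""
    n_expired = bisect.bisect_left(index_keys, cutoff)
    for loc in index_locs[:n_expired]:
        del rows_by_loc[loc]
    del index_keys[:n_expired]
    del index_locs[:n_expired]
    return n_expired

# Hypothetical data: ten rows with timestamps 0..9; rows older than 3 expire.
rows = [(ts, f"payload-{ts}") for ts in range(10)]
scan_remaining, examined = delete_expired_by_scan(rows, cutoff=3)

rows_by_loc = dict(enumerate(rows))
index_keys = [ts for ts, _ in rows]
index_locs = list(range(len(rows)))
index_updates = delete_expired_by_index(rows_by_loc, index_keys, index_locs, cutoff=3)
```

The scan path examines all ten rows to remove three, while the indexed path performs one index update per removed row, mirroring the per-row index maintenance of insertion.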