1. Field of the Invention
The present invention relates generally to dynamic update cubes for range-sum queries, and more particularly to a hybrid query search method, which provides a precise answer or an approximate answer with respect to On-Line Analytic Processing (OLAP) queries by using a delta (Δ)-tree, which has a multidimensional index structure and a prefix-sum cube, so as to effectively support range-sum queries widely used in decision making in enterprises.
2. Description of the Prior Art
Generally, On-Line Analytic Processing (OLAP) is a category of database technology allowing analysts to gain insight on an aggregation of data through access to a variety of possible views of information. This technology often requires the summarization of data at various levels of detail and with various combinations of attributes.
Typical OLAP applications include product performance and profitability, effectiveness of a sales program or marketing campaign, sales forecasting, and capacity planning. Among various OLAP application areas, a data model for the multidimensional database (MDDB), which is also known as a data cube, becomes increasingly important.
The data cube is constructed from a subset of attributes in the database. Certain attributes are chosen to be measure attributes, i.e., the attributes whose values are of interest. Other attributes are selected as dimensions or functional attributes. The measure attributes are aggregated according to their dimensions.
For example, consider a data cube maintained by a car-sales company. It is assumed that the data cube has four dimensions, i.e., MODEL_NO., YEAR, REGION, and COLOR, and one measure attribute, i.e., AMOUNT_OF_SALES. In this case, assume that the domain of MODEL_NO. contains 30 models, YEAR ranges from 1990 to 2001, the REGION dimension contains 40 regions, and COLOR lists white, red, yellow, blue, gray, and black. The data cube therefore has 30×12×40×6 = 86,400 cells, and each cell contains AMOUNT_OF_SALES as a measure attribute for the corresponding combination of four functional attributes, i.e., MODEL_NO., YEAR, REGION, and COLOR.
The data cube supports a useful analysis tool called a range-sum query, which applies an aggregate operation to the measure attribute within the range of the query.
A typical example may include “Find the total amount of sales in Seoul for all models of red color between 1995 and 2000”. Queries of this form for obtaining the range-sum are very popular, and the response time is very important for the OLAP application, which needs user-interaction with respect to a corresponding query.
A direct method for processing the range-sum query is to access the data cube itself.
However, this method suffers from the fact that too many cells must be accessed to calculate the range-sum, since the number of cells to be accessed is proportional to the size of the sub-cube defined by the query.
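The cost of the direct method can be illustrated with a minimal sketch. The function name `direct_range_sum`, the nested-list cube layout, and the sample values are assumptions made for illustration only and do not come from the cited prior art.

```python
from itertools import product

def direct_range_sum(cube, lo, hi):
    """Sum every cell whose index lies in [lo, hi] (inclusive) on each dimension.

    `cube` is a nested list (hypothetical layout); returns (total, cells_read).
    """
    ranges = [range(l, h + 1) for l, h in zip(lo, hi)]
    total = 0
    cells_read = 0
    for idx in product(*ranges):
        cell = cube
        for i in idx:          # walk down one nesting level per dimension
            cell = cell[i]
        total += cell
        cells_read += 1        # one cell access per cell in the sub-cube
    return total, cells_read

# Small 3 x 4 two-dimensional cube with made-up values.
cube = [[3, 5, 1, 2],
        [7, 3, 2, 6],
        [2, 4, 2, 3]]
total, cells_read = direct_range_sum(cube, (1, 1), (2, 3))
print(total, cells_read)
```

Here the number of cells read equals the number of cells in the queried sub-cube. Applied to the car-sales example, a query over one region, one color, all 30 models, and the six years from 1995 to 2000 would read 30×6×1×1 = 180 cells in the same way.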
To improve the direct method and enhance search efficiency, a prefix-sum method using a prefix-sum cube (PC) has been proposed. The prefix-sum method focuses on reducing the search cost.
Current enterprise environments force data elements in the cube to be dynamically changed. In such environments, the response time is affected by the update time as well as by the search time of the cube.
Therefore, various methods have been proposed to reduce the update cost by improving the prefix-sum method. These methods use additional data structures such as a relative prefix-sum cube (RPC) to minimize the update propagation over the prefix-sum cube (PC).
However, these methods are limited in that, although they reduce the update propagation and thereby improve the update speed to some degree, the improvement is not sufficient, because the RPC is merely a slight transformation of the PC. Furthermore, these methods are problematic in that they sacrifice search efficiency in order to accelerate updates.
Therefore, in many OLAP applications, it becomes an important issue to improve the update performance while minimizing sacrifice in the search efficiency.
As described above, various new methods addressing the query on an OLAP data cube are proposed in the prior art.
An elegant algorithm, which is called the prefix-sum method, for computing range-sum queries in data cubes, is proposed in a thesis entitled “Range Queries in OLAP Data Cubes” by C. Ho, R. Agrawal, N. Megiddo and R. Srikant and published in “Proceedings of ACM SIGMOD Int'l Conference on Management of Data,” 1997, pp. 73–88 (hereinafter, the thesis is referred to as HAMS97). The essential idea of this method is to pre-compute many prefix-sums of the data cube and to use these pre-computed results for answering arbitrary queries at run-time.
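The idea of pre-computing prefix-sums can be sketched for the two-dimensional case as follows. This is an illustrative sketch only, not the implementation of HAMS97; the function names `build_prefix_cube` and `prefix_range_sum` and the sample values are assumptions. Each cell of the prefix-sum cube stores the sum of all original cells at or below its index on both dimensions, so that any range-sum can be assembled from four look-ups by inclusion and exclusion.

```python
def build_prefix_cube(cube):
    """Return pc where pc[i][j] = sum of cube[x][y] for all x <= i, y <= j."""
    rows, cols = len(cube), len(cube[0])
    pc = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        row_sum = 0
        for j in range(cols):
            row_sum += cube[i][j]
            pc[i][j] = row_sum + (pc[i - 1][j] if i > 0 else 0)
    return pc

def prefix_range_sum(pc, lo, hi):
    """Range-sum over [lo, hi] (inclusive) using four prefix-sum look-ups."""
    def p(i, j):
        # A prefix-sum with a negative index covers no cells.
        return pc[i][j] if i >= 0 and j >= 0 else 0
    (l1, l2), (h1, h2) = lo, hi
    return p(h1, h2) - p(l1 - 1, h2) - p(h1, l2 - 1) + p(l1 - 1, l2 - 1)

# Small 3 x 4 two-dimensional cube with made-up values.
cube = [[3, 5, 1, 2],
        [7, 3, 2, 6],
        [2, 4, 2, 3]]
pc = build_prefix_cube(cube)
answer = prefix_range_sum(pc, (1, 1), (2, 3))
print(answer)
```

In this two-dimensional sketch a query reads at most four cells of the prefix-sum cube regardless of the size of the queried range; in d dimensions the same inclusion-exclusion argument uses at most 2^d look-ups.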
However, even though the prefix-sum method reduces response times for queries, it is very expensive to maintain the prefix-sum cube when data elements in the cube are changed.
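The maintenance cost can be seen in a small sketch, again for the two-dimensional case and under the assumption that each prefix-sum cell stores the sum of all original cells at or below its index. The function name `update_prefix_cube` and the sample values are hypothetical.

```python
def update_prefix_cube(pc, i, j, delta):
    """Propagate a change of `delta` in original cell (i, j) through pc.

    Every prefix-sum cell at index (x, y) with x >= i and y >= j includes
    the changed cell, so each one must be rewritten. Returns the number
    of prefix-sum cells touched.
    """
    touched = 0
    for x in range(i, len(pc)):
        for y in range(j, len(pc[0])):
            pc[x][y] += delta
            touched += 1
    return touched

# Prefix-sum cube of a 3 x 4 cube with made-up values.
pc = [[3, 8, 9, 11],
      [10, 18, 21, 29],
      [12, 24, 29, 40]]
touched_corner = update_prefix_cube(pc, 0, 0, 1)  # change in the first cell
touched_last = update_prefix_cube(pc, 2, 3, 1)    # change in the last cell
print(touched_corner, touched_last)
```

A change in the first cell rewrites the entire prefix-sum cube, whereas a change in the last cell rewrites a single entry, so in the worst case the update cost grows with the total number of cells in the cube.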
In order to reduce the update propagation in the prefix-sum cube, a method using the relative prefix-sum cube (RPC) is proposed in a thesis entitled “Relative prefix-sums: an efficient approach for querying dynamic OLAP Data Cubes” by S. Geffner, D. Agrawal, A. El Abbadi, and T. Smith and published in “Proceedings of Int'l Conference on Data Engineering,” Australia, 1999, pp. 328–335 (hereinafter, the thesis is referred to as GAES99). This method attempts to balance the query-update tradeoff between the prefix-sum method and the direct method. However, this method is problematic in that it is impractical in data cubes of large dimensions and high capacity since the update cost increases exponentially.
In order to solve the problem, a new class of cube called Hierarchical Cube (HC), based on two orthogonal dimensions, is proposed in a thesis entitled “Hierarchical cubes for range-sum queries” by C. Y. Chan and Y. E. Ioannidis and published in “Proceedings of Int'l Conference on Very Large Data Bases,” Scotland, 1999, pp. 675–686 (hereinafter, the thesis is referred to as CI99). A hierarchical band cube described in this thesis has a significantly better query and update trade-off than that of the algorithm proposed in the thesis GAES99. However, this method is problematic in that an index mapping from a high-level “abstract” cube to a low-level “concrete” cube is too complicated to be successfully implemented.
More recently, a dynamic data cube designed by recursively decomposing the prefix-sum cube is proposed in a thesis entitled “The Dynamic Data Cube” by S. Geffner, D. Agrawal, and A. El Abbadi and published in “Proceedings of Int'l Conference on Extending Database Technology,” Germany, 2000, pp. 237–253 (hereinafter, the thesis is referred to as GAE00). In the thesis GAE00, it is assumed that each dimension of a data cube is of the same size. Further, a tree structure is constructed by this decomposition technique.
However, the data cube of a practical environment, like the example of the car-sales company described above, has dimensions of different sizes, which makes it difficult to keep the tree balanced while decomposing the prefix-sum cube. Further, if the data cube is of high dimensions and high capacity, this method may cause problems, such as the incurrence of a computation overhead.