1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to an active cache approach to caching multi-dimensional data sets for an on-line analytical processing (OLAP) system that uses a relational database management system (RDBMS).
2. Description of Related Art
(Note: This application references a number of different publications as indicated throughout the specification by reference numbers enclosed in brackets, e.g., [x]. A list of these different publications ordered according to these reference numbers can be found in the “Detailed Description of the Preferred Embodiment” in Section 9 entitled “References.” Each of these publications is incorporated by reference herein.)
On-Line Analytical Processing (OLAP) systems provide tools for analysis of multi-dimensional data. Most systems are built using a three-tier architecture, wherein the first or client tier provides a graphical user interface (GUI) or other application, the second or middle tier provides a multi-dimensional view of the data, and the third or server tier comprises a relational database management system (RDBMS) that stores the data.
Most queries in OLAP systems are complex and require the aggregation of large amounts of data. However, decision support applications in OLAP systems need to be interactive and demand fast response times. Different techniques to speed up queries have been studied and implemented, both in research and industrial systems. These include pre-computation of aggregates in the RDBMS, having specialized index structures, and caching in the middle tier.
The problem of pre-computing a cube has been studied in [AAD+96], [ZDN97], and [RS97]. [SDNR96] deals with the issue of the space required for pre-computation. Picking GROUP-BYs to pre-compute has been studied in [HRU96] and [SDN98]. [RKR97] and [KR98] consider the problem of efficient organization of the cube data.
In the field of caching, [SSV] presents replacement and admission schemes specific to warehousing. The problem of answering queries with aggregation using views has been studied extensively in [SDJL96]. [SLCJ98] presents a method for dynamically assembling views based on granular view elements which form the building blocks.
Semantic query caching for client-server systems has been studied in [DFJST]. A recent work on semantic caching is based on caching Multidimensional Range Fragments (MRFs), which correspond to semantic regions having a specific shape [KR99]. Each dimension in an MRF either covers the entire range on the dimension or is a point selection on the dimension.
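The shape constraint on an MRF can be illustrated with a minimal sketch. The representation below (a dictionary of inclusive integer ranges per dimension) and the names `Region`, `is_mrf`, and the sample dimensions are assumptions made for illustration only; they are not taken from [KR99].

```python
from typing import Dict, Tuple

# Hypothetical representation: each dimension maps to an inclusive (low, high) range.
Region = Dict[str, Tuple[int, int]]


def is_mrf(region: Region, domain: Region) -> bool:
    """Return True if every dimension is either the full domain range or a single point."""
    for dim, (lo, hi) in domain.items():
        r_lo, r_hi = region[dim]
        covers_all = (r_lo, r_hi) == (lo, hi)  # spans the entire dimension
        is_point = r_lo == r_hi                # point selection on the dimension
        if not (covers_all or is_point):
            return False
    return True


# Illustrative dimension domains (assumed values).
domain = {"product": (0, 99), "store": (0, 9), "month": (0, 11)}

# All products, one store, all months: qualifies as an MRF.
full_product_one_store = {"product": (0, 99), "store": (3, 3), "month": (0, 11)}

# A partial range on "product" is neither the full range nor a point: not an MRF.
partial_range = {"product": (10, 20), "store": (3, 3), "month": (0, 11)}
```

Under these assumptions, `is_mrf(full_product_one_store, domain)` holds, while `is_mrf(partial_range, domain)` does not.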
Another kind of caching is chunk-based caching, which is a semantic caching method optimized for the domain of OLAP systems. Chunk-based caching was proposed in [DRSN98]. The motivation of chunk-based caching is to allow a query to take advantage of overlap with previous queries, even if the later queries are not totally contained in the previous queries.
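The benefit of partial overlap can be sketched as follows. This is a simplified illustration, not the scheme of [DRSN98]: it assumes every dimension is split into uniform ranges of a fixed size, and the names `CHUNK_SIZE`, `chunks_for_query`, `answer`, and `fetch_chunk` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

ChunkId = Tuple[int, ...]

CHUNK_SIZE = 10  # assumption: every dimension is split into ranges of 10 values


def chunks_for_query(ranges: List[Tuple[int, int]]) -> List[ChunkId]:
    """List the ids of all chunks overlapped by an inclusive range per dimension."""
    per_dim = [range(lo // CHUNK_SIZE, hi // CHUNK_SIZE + 1) for lo, hi in ranges]
    return list(product(*per_dim))


def answer(
    ranges: List[Tuple[int, int]],
    cache: Dict[ChunkId, object],
    fetch_chunk: Callable[[ChunkId], object],
):
    """Return (chunk data by id, ids that had to be fetched from the backend)."""
    result, missed = {}, []
    for cid in chunks_for_query(ranges):
        if cid not in cache:
            cache[cid] = fetch_chunk(cid)  # only missing chunks go to the backend
            missed.append(cid)
        result[cid] = cache[cid]
    return result, missed


# Stand-in for a backend RDBMS query (hypothetical).
def fake_backend(cid: ChunkId) -> str:
    return f"data{cid}"


cache: Dict[ChunkId, object] = {}
_, first_missed = answer([(0, 19), (0, 9)], cache, fake_backend)
_, second_missed = answer([(10, 29), (0, 9)], cache, fake_backend)
```

In this sketch the second query is not contained in the first, yet it reuses the cached chunk `(1, 0)` and fetches only chunk `(2, 0)` from the backend.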
Generally, these different caching techniques have focused on using cached results from a previous query as the answer to another query. This strategy is effective when the query stream exhibits a high degree of locality. Unfortunately, it misses the dramatic performance improvements obtainable when the answer to a query, while not immediately available in the cache, can be computed from data in the cache. The present invention considers answering queries by aggregating data in the cache.
The present invention discloses a method, apparatus, and article of manufacture for caching multidimensional data sets for an on-line analytical processing (OLAP) system. An “active cache” is used, wherein the cache can not only answer queries that match data stored in the cache, but can also answer queries that require aggregation or other computation of the data stored in the cache.
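The idea of answering a query by aggregating cached data can be illustrated with a minimal sketch. It assumes a SUM aggregate (which can be rolled up from finer groupings) and a cache keyed by GROUP-BY column lists; the names `Cache`, `answer_from_cache`, and the sample data are hypothetical and do not represent the disclosed embodiment.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

# Hypothetical cache: group-by column names -> {group key tuple: SUM(sales)}
Cache = Dict[Tuple[str, ...], Dict[Tuple, float]]


def answer_from_cache(cache: Cache, group_by: Tuple[str, ...]) -> Optional[Dict[Tuple, float]]:
    """Answer a SUM group-by query from the cache, aggregating finer data if needed."""
    if group_by in cache:
        return cache[group_by]  # exact match: conventional cache hit
    for cached_cols, rows in cache.items():
        # A cached result grouped on a superset of the columns can be rolled up.
        if set(group_by) <= set(cached_cols):
            keep = [cached_cols.index(c) for c in group_by]
            rolled = defaultdict(float)
            for key, total in rows.items():
                rolled[tuple(key[i] for i in keep)] += total
            return dict(rolled)
    return None  # miss: the query would go to the backend RDBMS


cache = {
    ("product", "store"): {
        ("tv", "s1"): 5.0,
        ("tv", "s2"): 7.0,
        ("radio", "s1"): 2.0,
    }
}

# Not stored in the cache, but computable from the (product, store) result.
by_product = answer_from_cache(cache, ("product",))
```

Here the query grouped by `product` alone is not stored in the cache, but it is computed by summing the cached `(product, store)` rows rather than querying the backend again.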