1. Technical Field
The present invention relates generally to memory access and, more specifically, to a method and system for caching multi-dimensional data.
2. Background of the Invention
Relational databases usually include a plurality of tables that are searched ("queried") using a well-known query language, such as the Structured Query Language (SQL). Relational databases, however, do not allow a user to selectively extract and view data from different points of view. To organize and summarize data for efficient analytical querying, a concept of a cube is used.
A cube contains one or more dimensions and one or more measures. Measures are central values in a cube that are analyzed, such as sales, profit, costs of goods sold or inventory count. A cube provides a logical, easily understood mechanism for querying data. A cube allows a user to extract and view data from different points of view. Dimension levels are a powerful tool, allowing users to ask questions at a high level and then expand a dimension hierarchy to reveal more details. Using a drill down/drill up technique a user may navigate through levels of data ranging from the most summarized (up) to the most detailed (down).
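By way of illustration only, the drill up/drill down technique described above can be sketched in Python. The sketch assumes a cube held in memory as a mapping from dimension members to a single sales measure. The dimension names, the sample figures, and the roll_up helper are hypothetical and are not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical cube cells: (year, quarter, region) -> sales measure.
FACTS = {
    ("2001", "Q1", "East"): 100,
    ("2001", "Q1", "West"): 150,
    ("2001", "Q2", "East"): 120,
    ("2001", "Q2", "West"): 130,
}


def roll_up(facts, keep):
    """Sum the measure over every dimension whose position is not in `keep`.

    `keep` lists the positions of the cell key that remain visible.
    """
    totals = defaultdict(int)
    for key, value in facts.items():
        totals[tuple(key[i] for i in keep)] += value
    return dict(totals)


by_year = roll_up(FACTS, keep=(0,))        # most summarized view (drill up)
by_quarter = roll_up(FACTS, keep=(0, 1))   # one level more detailed (drill down)
```

Keeping fewer key positions corresponds to drilling up, and keeping more positions corresponds to drilling down toward the individual cells.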
When a user requests data from one area of a cube, he will probably also be interested in viewing data that clusters around that area of the cube. To retrieve such data, however, a number of individual queries need to be submitted to a database. Conventional caching approaches allow for caching each database address and a value corresponding to a measure attribute. Such an approach works acceptably well with cubes having a small number of dimensions. However, the number of possible stored measures grows exponentially in cubes with a large number of dimensions. Therefore, querying and caching each pair (an address and a value corresponding to a measure attribute) results in a large number of single measure queries against the database.
What is needed is a way to increase the efficiency of data access in a database.
A described embodiment of the present invention provides a system, method, and a computer program product for caching multi-dimensional data in a data cache. The described embodiment uses a known multi-dimensional construct, a cube, to represent the dimensions of data available to a user. This construct may have one or more dimensions. When the user submits a query or request for data, the request is converted to a set of canonical addresses and a set of cubelet addresses corresponding to their location in the cube. The described embodiment defines a region of related data in a cube to be a cubelet. A cubelet is a collection of values of a corresponding measure attribute and their associated canonical addresses. A cubelet address is the unique name for a cubelet that both uniquely identifies the cubelet, and identifies its location in the cube. A canonical address is the address of a single cell in the cube, and uniquely identifies one set of measures in the cube.
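The relationship between canonical addresses and cubelet addresses may be illustrated with a minimal Python sketch. The text above does not specify how a cubelet address is derived, so the sketch assumes that each cell is addressed by integer coordinates and that a cubelet is a fixed-size block of neighboring cells. The constant CUBELET_EDGE and the function names are hypothetical.

```python
from itertools import product

CUBELET_EDGE = 4  # assumed number of cells per cubelet along each dimension


def canonical_address(coords):
    """Address of a single cell: one integer coordinate per dimension."""
    return tuple(coords)


def cubelet_address(canonical, edge=CUBELET_EDGE):
    """Name of the cubelet containing the cell; it also locates the block in the cube."""
    return tuple(c // edge for c in canonical)


def cells_in_cubelet(address, edge=CUBELET_EDGE):
    """All canonical addresses covered by the cubelet at `address`."""
    ranges = [range(a * edge, (a + 1) * edge) for a in address]
    return list(product(*ranges))


cell = canonical_address((5, 2, 9))
block = cubelet_address(cell)  # (1, 0, 2) under the assumed edge of 4
```

Under this assumption, nearby cells such as (5, 2, 9) and (6, 3, 11) map to the same cubelet address, so a single cached cubelet can serve requests for either cell.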
In a described embodiment, an execution module probes a data cache based on a cubelet address to determine if that portion of the cube has previously been cached. If so, the data cache returns the cubelet, which may contain more data than requested in the query. The execution module then probes the cubelet for the requested data and returns the requested data to the user. If the cubelet identified by the cubelet address is not found in the data cache, a fault handler queries a back-end database for data. The database returns a result set, which includes the requested data and the data for "nearby cells." The returned data is stored in the data cache in the form of a cubelet. Different cubelets may represent different levels of data in the database.
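The probe and fault sequence described above may be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the implementation of the described embodiment: the class CubeletCache, the function fake_backend, and the fixed block size are hypothetical, and the back-end database is replaced by an in-memory function that returns every cell of the requested block.

```python
from itertools import product


def fake_backend(address, edge):
    """Hypothetical stand-in for the database: returns the requested cell and
    its nearby cells. Each cell's value is simply the sum of its coordinates."""
    ranges = [range(a * edge, (a + 1) * edge) for a in address]
    return {cell: sum(cell) for cell in product(*ranges)}


class CubeletCache:
    def __init__(self, backend, edge=2):
        self.backend = backend      # callable(cubelet_address, edge) -> {cell: value}
        self.edge = edge            # assumed cubelet size along each dimension
        self.cubelets = {}          # cubelet address -> cubelet
        self.backend_queries = 0    # counts fault-handler round trips

    def _cubelet_address(self, canonical):
        return tuple(c // self.edge for c in canonical)

    def _fault(self, address):
        """Fault handler: fetch the whole block once and store it as a cubelet."""
        self.backend_queries += 1
        cubelet = self.backend(address, self.edge)
        self.cubelets[address] = cubelet
        return cubelet

    def get(self, canonical):
        """Probe the cache by cubelet address, then probe the cubelet for the cell."""
        address = self._cubelet_address(canonical)
        cubelet = self.cubelets.get(address)
        if cubelet is None:
            cubelet = self._fault(address)
        return cubelet.get(canonical)


cache = CubeletCache(fake_backend, edge=2)
first = cache.get((0, 1))    # miss: one back-end query loads the whole block
second = cache.get((1, 1))   # hit: served from the cubelet cached above
```

In this sketch the second request does not reach the back end, because the first fault already brought the neighboring cell into the cache as part of the same cubelet.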