A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever. The following notice shall apply to this document: Copyright © 1999, Microsoft, Inc.
The present invention pertains generally to multidimensional data access methods, and more particularly to a system for mapping multidimensional space to one-dimensional space.
On-Line Analytical Processing ("OLAP") is a key part of most data warehouse and business analysis systems. OLAP systems provide for the fast analysis of multidimensional information. For this purpose, OLAP provides for multidimensional access and navigation of data in an intuitive and natural way, providing a global view of data, but also allowing for fast drill down into data of interest. Speed and consistent response time are important attributes of OLAP so that users can efficiently browse and analyze data on-line. Additionally, OLAP typically provides analytical tools, for example, to rank, aggregate, and calculate lead and lag indicators for the data under analysis.
Referring to FIG. 1, there is illustrated a multidimensional space with three dimensions: product, region and time, wherein each cell in the space represents the dollar value (values not shown) of the itemized products. The products in this example are stereos, televisions (TV's) and personal computers (PC's). The regions in this example are Seattle, San Francisco (SF), Los Angeles (LA), Toronto, Montreal, Paris, Nice, Mars, Rome, and Milan. The time dimension is divided into the four quarters of 1991 and 1992, with 1991 and 1992 being aggregations of their respective quarters. Using OLAP, a user might, for example, choose to view sales by continent, such as America or Europe, as may be obtained by adding the sales for each city in the respective continent, or drill down and view sales in USA and Canada. Alternatively, a user may desire to compare sales in the years 1991 and 1992, or drill down and compare sales quarter to quarter. By navigating through the data in this manner, a user of an OLAP system can quickly create many different views of data and, hopefully, gain insight and knowledge from these views.
From a design perspective, OLAP systems present the related issues of the user's view and navigation of complex data, and how to store and represent the data to make viewing and navigation most efficient. As discussed in Analysis of the Clustering Properties of Hilbert Space-filling Curve, by Moon et al., available at URL http://www.cs.umd.edu/TR/UMCP-CSD:CS-TR-3611, the design of multidimensional access methods that fulfill the needs of OLAP is difficult as compared to one-dimensional cases because there is no total ordering that preserves spatial locality. When a suitable mapping function is applied to a given spatial database, a one-dimensional access method, such as a B+-tree, may yield good performance for multidimensional queries. Referring to FIGS. 2A, 2B and 2C, there are shown three well-known prior art space mapping functions, namely the z-curve, Gray coding and the Hilbert curve. These methods generally attempt to cluster related multidimensional data in a linear storage medium such as magnetic media or random access memory. For this purpose, all of these methods assume that the probability of fetching an adjacent member is uniformly distributed across the dimensions of the database. However, this is not usually the case for data representative of measurements in real life. Also, these prior art methods tend to artificially divide dimensions by powers of two, which is also suboptimal in many cases.
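By way of illustration only, the z-curve mapping referred to above can be sketched as a bit interleaving of cell coordinates. The function name z_order_index and the 4x4 grid are assumptions made for this example and do not appear in the cited reference; the sketch covers the two-dimensional case only.

```python
def z_order_index(x: int, y: int, bits: int = 16) -> int:
    """Map a 2-D cell (x, y) to a 1-D position by interleaving bits.

    Bits of x land in the even positions and bits of y in the odd
    positions of the result (a z-curve, also called Morton order).
    """
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)      # x bit i -> position 2i
        index |= ((y >> i) & 1) << (2 * i + 1)  # y bit i -> position 2i + 1
    return index


# Linear storage order of a hypothetical 4x4 grid of cells.
cells = sorted(
    ((x, y) for x in range(4) for y in range(4)),
    key=lambda cell: z_order_index(*cell),
)
```

Because each coordinate contributes a fixed number of bits, a dimension with, for example, three members still occupies a range of four positions, which illustrates the power-of-two division noted above.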
As described more fully below, the embodiments of the invention provide for more efficient mapping of multidimensional data to the one-dimensional space of linear storage media. More specifically, these embodiments provide that multidimensional space is divided into volumes based on the priority of levels within the dimensions of the data. Spatial to linear mapping is then applied to the multidimensional data such that records or data with dimension members belonging to the same parent in the hierarchy will be close to each other across all dimensions.
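The general idea of keeping records with a common parent close together may be illustrated with the following minimal sketch. It is not the mapping of the embodiments, which is described more fully below; it assumes that each record carries, for every dimension, a root-to-leaf path of member ordinals, and that levels are prioritized simply by depth, with dimensions taken in the order given. The name hierarchical_key and the sample ordinals are hypothetical.

```python
from itertools import zip_longest


def hierarchical_key(paths):
    """Build a sort key from per-dimension hierarchy paths.

    paths holds one tuple per dimension, listing member ordinals from
    the top level down, e.g. (continent, city) and (year, quarter).
    Levels are interleaved across dimensions (an assumed priority
    order), so top-level members dominate the ordering.
    """
    key = []
    for level in zip_longest(*paths):
        # Dimensions with fewer levels contribute nothing at deeper levels.
        key.extend(member for member in level if member is not None)
    return tuple(key)


# Hypothetical records: ((continent, city), (year, quarter)) ordinals.
records = [
    ((1, 0), (0, 0)),
    ((0, 0), (1, 0)),
    ((0, 1), (0, 0)),
    ((0, 0), (0, 0)),
]
ordered = sorted(records, key=hierarchical_key)
```

Under these assumptions, the two records sharing continent 0 and year 0 are placed next to each other in the linear order, ahead of records that differ in either top-level member.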
According to yet another aspect of the invention, there is provided an on-line analytical processing system wherein data from a multidimensional space is stored in a one-dimensional space in a storage medium in accordance with the method outlined above.
According to still another aspect, the invention is embodied as a data structure wherein data from a multidimensional space is stored in a one-dimensional space in a storage medium in the structure outlined above.