The present disclosure relates generally to the field of multidimensional analytical queries, and more particularly to managing the percentage of unpopulated cells in a multidimensional data structure during the servicing of multidimensional analytical queries. Transforming data into meaningful and useful information, for example, to locate historical trends, sometimes requires the analysis of a large data set. Multidimensional analysis is a statistical technique that can decrease the time required to analyze a large data set. One approach to servicing multidimensional analytical queries swiftly involves Online Analytical Processing (hereinafter “OLAP”) systems, a technology used to create decision support software. OLAP utilizes information that has been summarized into multidimensional views and hierarchies that are readily available for detailed queries and analytics.
Central to the OLAP process are cubes, which are not cubes in the traditional sense but are predetermined, hierarchical, multidimensional arrays that are stored in memory. Unlike relational databases, which use two-dimensional data structures (often in the form of columns and rows in a spreadsheet), OLAP cubes are logical, multidimensional data structures that can have numerous dimensions and levels of data. A cube that includes more than three (3) dimensions is referred to as a hyper-cube. OLAP systems are built using a three-tier architecture, wherein the first or client tier provides a graphical user interface or other application, the second or middle tier provides a multidimensional view of the data, and the third or server tier comprises a relational database management system that stores the data.
Different cubes can be used for different types of data and queries. An OLAP cube stores data in a manner that enables fast retrieval of summarized data. Data summarization in this context means condensing large amounts of data into meaningful numbers such as counts, sums, averages, or other statistical measures. The structure of a cube is hierarchical in nature and is derived from the associations between the different columns and rows of data in a data source. In theory, an OLAP cube is an abstract representation of a projection of a relation in a relational database management system, which projection is pre-computed and cached for faster query servicing. OLAP cubes comprise dimensions, levels, hierarchies, members, and member properties. This structure enables a user to easily select data subsets and navigate the cube structure when querying the cube.
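By way of illustration only, the pre-computation and caching of summarized data described above can be sketched in Python as follows. The fact rows, the dimension names (region, product, year), the `ALL` marker, and the functions `build_cube` and `query` are hypothetical and are not drawn from any particular OLAP product; the sketch assumes a small in-memory data set and pre-computes a sum and a count for every combination of dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales measure).
ROWS = [
    ("East", "Widget", 2020, 100.0),
    ("East", "Gadget", 2020, 50.0),
    ("West", "Widget", 2020, 70.0),
    ("West", "Widget", 2021, 30.0),
]
DIMENSIONS = ("region", "product", "year")
ALL = "*"  # assumed marker: "summarized over every member of this dimension"


def build_cube(rows):
    """Pre-compute the sum and count for every combination of dimensions."""
    cube = defaultdict(lambda: {"sum": 0.0, "count": 0})
    n = len(DIMENSIONS)
    for *members, measure in rows:
        # Each subset of dimensions that is kept (not summarized away)
        # identifies one cell to which this row contributes.
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                key = tuple(members[i] if i in kept else ALL for i in range(n))
                cube[key]["sum"] += measure
                cube[key]["count"] += 1
    return dict(cube)


def query(cube, region=ALL, product=ALL, year=ALL):
    """Look up a pre-computed cell; returns None if the cell is unpopulated."""
    return cube.get((region, product, year))


cube = build_cube(ROWS)
print(query(cube, region="East"))                # {'sum': 150.0, 'count': 2}
print(query(cube, product="Widget", year=2020))  # {'sum': 170.0, 'count': 2}
print(query(cube, region="West", product="Gadget"))  # None (unpopulated cell)
```

Because every cell is computed once when the cube is built, each query in this sketch reduces to a single dictionary lookup; an average can be derived from a cell by dividing its sum by its count.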
Selecting data subsets can result in selecting a dimension that is not completely populated by statistical measures, which can increase the proportion of unpopulated cells in the resulting cube (hereinafter “sparsity”). As the degree of sparsity in a cube increases, the time required to query the cube increases.
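As a minimal, non-limiting sketch, the sparsity of a cube can be expressed as the percentage of cells, out of all possible combinations of dimension members, that hold no measure. The function `sparsity_percent`, the sample members, and the set of populated cells below are hypothetical and serve only to show how selecting a sparsely populated subset can raise the sparsity of the resulting cube.

```python
from itertools import product


def sparsity_percent(populated_cells, dimension_members):
    """Return the percentage of unpopulated cells in a cube.

    populated_cells: set of coordinate tuples that hold a measure.
    dimension_members: one sequence of members per dimension.
    """
    total = 1
    for members in dimension_members:
        total *= len(members)
    if total == 0:
        raise ValueError("every dimension needs at least one member")
    # Count only the populated cells that fall inside the selected members.
    populated = sum(
        1 for coords in product(*dimension_members) if coords in populated_cells
    )
    return 100.0 * (total - populated) / total


# Hypothetical populated cells, keyed by (region, product, year).
POPULATED = {
    ("East", "Widget", 2020),
    ("East", "Gadget", 2020),
    ("West", "Widget", 2020),
    ("West", "Widget", 2021),
}

full_cube = (["East", "West"], ["Widget", "Gadget"], [2020, 2021])
gadget_only = (["East", "West"], ["Gadget"], [2020, 2021])

print(sparsity_percent(POPULATED, full_cube))    # 50.0
print(sparsity_percent(POPULATED, gadget_only))  # 75.0
```

In this example the full cube has four populated cells out of eight, whereas the subset restricted to the hypothetical "Gadget" member has one populated cell out of four, so the selection raises the sparsity from 50 percent to 75 percent.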