Business decision makers often use business intelligence analytical software to pose operational performance questions as queries against their data sources. The basic capabilities of querying and reporting functions is extended by On-line Analytical Processing (OLAP), allowing a robust multidimensional understanding of the data from a variety of perspectives and hierarchies in a multidimensional database. OLAP operations such as drill-down, roll-up and pivot provide insights into business growth, spending, and sales patterns that would simply not be possible otherwise. Other OLAP functionality includes operations for ranking, moving averages, growth rates, statistical analysis, and “what if” scenarios.
Multidimensional databases intuitively view data as a multidimensional structural metaphor called a cube whose cells correspond to events that occurred in the business domain. Each event is quantified by a set of measures; each edge of the cube corresponds to a relevant dimension for analysis, typically associated to a hierarchy of attributes that further describe it. A multidimensional database may further comprise a collection of related cubes. Dimensions, such as an essential and distinguishing concept in multidimensional databases, are used for selecting and aggregating data at the desired level of detail.
However, the data to be analyzed often have up to 20 or more dimensions, making computations complex and costly. As the dimensions increase, and the number of members of each dimension increases, the number of cells increases dramatically. The number of cells in many cubes representing a business process in a medium or large company is often too large to provide a fast and efficient calculation.
Many of the cells in a cube are interconnected by formulas. Cells representing profit, for example, are calculated by the difference between corresponding cells representing revenue and corresponding cells representing expenses. Cells representing a year are computed as the sum of corresponding cells representing months which in turn are computed as the sum of corresponding cells representing days.
Often there is null or zero data interspersed throughout important data, resulting in sparsity of relevant data in the database. For example, no sale was made on a given day of a given product by a given salesperson to a given customer by a given sales channel. The business intelligence analysis activity typically involves the creation and manipulation of a cross-tabulation, or cross-tab. The sparsity of data makes it difficult to derive useful information when presenting data in a cross-tab, due to the relationship-based nature of multidimensional querying, analysis and reporting.
The sparsity of data may also be a result of a user creating a new cross-tab. Many common user scenarios may lead to cross-tabs with empty result sets. Removing sparse data from cross-tab views is therefore an important way to improve the readability of the user's data.
The business intelligence analytical tool is often implemented in a server-client architecture. The analytical tool often includes a client that provides a user interface, for example, a client browser; and a server that handles processing of queries and retrieval of data from data sources. Therefore, both accessing the desired cells with sparsity in the vast array of data in a database, and returned result data with sparsity can add significantly to the time taken to process a query.
There have been different approaches to remove the sparsity in the data. One is a brute-force zero-suppression by removing all cells with null or zero value in query result individually. This approach, while easy to execute, does not provide any selectivity when the null or zero data are removed, thereby suffering from poor performance for any significantly sized multi-dimensional data source.
Another approach is based on filtering member sets based on data value criteria. This approach may be tedious and error-prone as it requires in-depth knowledge of the data and intensive user-interaction to define appropriate filters, resulting in poor usability.
Yet another approach is described in U.S. Pat. 6,728,697 to Leathers, which is hereby incorporated by reference in its entirety. U.S. Pat. 6,728,697 teaches a mechanism to partition the processing work in such a way that the processing can be carried out by the OLAP data source, thereby improving the efficiency of the delegated portion of the operations, and making more effective use of the overall processing power of the computing facilities. Rows and columns of the original cross-tabulation (cross-tab) formed from the data are classified as either simple, or complex, depending whether the data can be computed at the OLAP data source. Accordingly, zero suppression may be carried out through an a priori preliminary determination of which rows and columns contain only zero, null or other non-relevant values, before the data is requested. A Boolean may be used to indicate whether a row has any zero or null data, and to condition subsequent operations.
There is therefore a need to query a multidimensional database in a single step, using sparse data removal criteria to determine whether certain data is returned in the query result, thereby enhancing the view of relevant data, and making the decision-making process more effective and timely.
It is further desirable to provide a mechanism for providing business intelligence information from the server to the client more efficiently, more specifically, there is a need to provide an efficient result set from a server to a client when responding to a query from the client.