Analytical data services are a key part of many data warehouse and business analysis systems. Such an analytical data service may be, for example, MICROSOFT ANALYSIS SERVICES™ from Microsoft Corp. of Redmond, Wash. Analytical data services provide for fast analysis of multidimensional information. For this purpose, analytical data services provide for multidimensional access and navigation of data in an intuitive and natural way, providing a global view of data that can be drilled down into particular data of interest. Speed and response time are important attributes of analytical data services that allow users to browse and analyze data online in an efficient manner. Further, analytical data services typically provide analytical tools to rank, aggregate, and calculate lead and lag indicators for the data under analysis.
In this context, an analytical data services cube may be modeled according to a user's perception of the data. A cube may organize a data type according to dimensions, each dimension modeled according to an attribute of the data type. For example, a cube may organize “Balance” data according to the dimensions “Time” and “Location” (and possibly other dimensions). Dimension members act as indices for identifying a particular cell or range of cells within the cube. The cube may also have a number of measures, which measure a data type according to its attributes. For example, the cube may have a measure “Value”, which measures the value of a balance at a specified time in a specified location.
Analytical data services are often used to analytically model data that is stored in a relational data source such as, for example, an Online Transactional Processing (OLTP) database. Data stored in such a relational data source may be organized according to multiple tables. Each such table may organize a data type according to columns corresponding to attributes and measures. For example, the cube discussed above may be modeled according to a “Balance” table with columns corresponding to attributes “Time” and “Location” and measure “Value”.
Typically, there are a number of hierarchies associated with each dimension of a cube. Each such hierarchy includes levels of granularity. For example, the time dimension can consist of years subdivided into quarters subdivided into months subdivided into weeks subdivided into days. The years level is the broadest level of granularity, while the days level is the finest level of granularity. A common scenario with respect to analytical data services processing is that there is data present for a finer child level of granularity, but there is no data present for a broader parent level of granularity. For example, there may be data for the individual months of January, February, and March, but there may be no data for the overall first quarter. To calculate the data for a parent member, it is necessary to aggregate the data for the child members.
The aggregation of child members is performed according to an aggregation function. The most common aggregation function is a simple sum function (“SUM”), in which the entries for each of the child members are summed to calculate the value of the parent members. For example, if the value of balances in January is 3, 9, and 6 for the cities of San Francisco, Los Angeles, and San Diego, respectively, then the SUM value of balances in California for January is 18 (18=3+9+6). Other common aggregation functions include, for example, a minimum function (“MIN”), which provides a minimum value and a maximum function (“MAX”), which provides a maximum value.
In conventional analytical data services, aggregation is performed uniformly, meaning that every dimension in a cube is aggregated according to the same aggregation function. However, a common problem with respect to aggregation is that uniform aggregation is not always desirable. In fact, non-uniform aggregation is particularly desirable in business domains such as, for example, securities, account balances, budgets, and insurance policies and claims. Thus, it may be desirable for a cube to include a number of measures which are semi-additive, meaning that they are aggregated differently across different dimensions. Specifically, semi-additive measures are uniformly aggregated across additive dimensions and non-uniformly aggregated across non-additive dimensions.
A common non-additive dimension is the time dimension, because it is often useful to evaluate measures differently according to time than according to other attributes. For example, with respect to the balance data discussed above, the “Value” measure is cumulative with respect to location but is not cumulative with respect to time. Balance is not cumulative with respect to time because balance measures instantaneous rather than cumulative value. For example, a balance for a first quarter is derived from the balance at the end of March rather than from the sum of the balances for January, February, and March. Accordingly, for balance data, a parent time dimension member is equivalent to the value of its last child member.
Conventional analytical data services enable users to calculate non-additive dimensions of a cube through user defined aggregation. User defined aggregation enables the user to define logic using a proprietary language or a standard language such as multidimensional expressions language (MDX). Such logic expresses how a parent member is computed based in its child members. User defined aggregation is discussed in detail in A. Netz, “OLAP Services: Semiadditive Measures and Inventory Snapshots”, Apr. 1, 1999, which is hereby incorporated by reference in its entirety. A drawback of user-defined aggregation is that it requires a proficiency in a proprietary or standard language to define logic for aggregating a non-additive dimension. Another drawback is that logic must be defined separately for each non-additive dimension, which may be particularly tedious and time consuming for a cube that includes a number of non-additive dimensions. Thus, there is a need in the art for a simple interface that enables the user to select non-additive dimensions and to pair each such non-additive dimension with a pre-defined aggregation function.
Another common problem with respect to aggregation is that it is often desirable to aggregate different sets of data differently across a single non-additive dimension. For example, in addition to including data from the “Balances” table, the cube set forth above may also include data from an “Income” table. Unlike balance data, which is not cumulative with respect to time, income data is cumulative with respect to time. Thus, income data and balance data are aggregated differently across the time dimension. As conventional analytical data services are limited in this respect, there is a need in the art for systems and methods which enable different sets of data to be aggregated differently across a single non-additive dimension.