The present application relates generally to multidimensional databases and more specifically to mechanisms for processing updates to data values at varying levels in the dimensional hierarchy.
Multidimensional databases are typically used by business enterprises to provide rapid, ad-hoc reporting on metrics such as financial performance or customer behavior. Broadly, such applications are referred to as online analytical processing (OLAP), business intelligence, data warehousing, or data mining.
Information in a multidimensional database is stored in a cube containing one or more numerical measure attribute dimensions associated with a set of feature attribute dimensions. For example, the measure attribute of sales might be stored with the feature attributes of date and location.
Many architectures exist for managing multidimensional databases. In multidimensional online analytical processing (MOLAP), data is stored in an optimized multidimensional array. This approach typically trades off extra time required to load data for reduced memory requirements and faster analytical query performance. In contrast to MOLAP, the relational online analytical processing (ROLAP) architecture stores data in relational database management system tables. The most common physical structure for storing the data is a star or snowflake schema, in which a central fact table contains columns for the measure attribute and feature attribute dimensions. The feature attribute columns in the fact table reference members that are uniquely identified in related dimension tables.
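By way of illustration only, the star schema described above can be sketched with an in-memory relational database. The table names, column names, and sample rows below (dim_date, dim_location, fact_sales) are hypothetical and are chosen to match the sales, date, and location example; they are not taken from any particular system.

```python
import sqlite3


def build_star_schema():
    """Create a tiny, hypothetical star schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables: each member is uniquely identified by its key.
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
        CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
        -- Central fact table: feature columns reference dimension members,
        -- and the measure column holds the numerical value.
        CREATE TABLE fact_sales (
            date_id INTEGER REFERENCES dim_date(date_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            sales REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, "2024-01-01", "2024-01"), (2, "2024-01-02", "2024-01")],
    )
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?, ?)",
        [(1, "Boston", "East"), (2, "Denver", "West")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 1, 100.0), (1, 2, 50.0), (2, 1, 70.0), (2, 2, 30.0)],
    )
    return conn


def fact_rows_with_members(conn):
    """Resolve the fact table's keys to member names via the dimension tables."""
    return conn.execute(
        """
        SELECT d.day, l.city, f.sales
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        JOIN dim_location AS l ON l.location_id = f.location_id
        ORDER BY f.date_id, f.location_id
        """
    ).fetchall()
```

In this sketch each row of fact_sales corresponds to one atomic cell, and the joins recover the date and location members that the row references.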
In nearly all instances, multidimensional databases are used to analyze information drawn from multiple data sources. As such, updates occur at regular, periodic intervals using software and systems that extract, transform, and load data from external databases. Numerical values are loaded for the lowest-level, most granular information in the database. Those skilled in the art refer to this most granular information as atomic cells. OLAP systems also optionally pre-calculate aggregate values associated with higher levels in the dimension hierarchies. In other words, OLAP systems are typically bottom-up models: data is loaded, stored, and updated at the most granular level, and summaries are calculated by aggregating atomic cell values.
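The bottom-up model can be illustrated with a minimal sketch in which atomic cells are summed into aggregates one level higher in each dimension hierarchy. The cell values, the day-to-month and city-to-region hierarchies, and the function name roll_up are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical atomic cells keyed by (day, city); each value is a sales figure.
ATOMIC_CELLS = {
    ("2024-01-01", "Boston"): 100.0,
    ("2024-01-02", "Boston"): 70.0,
    ("2024-01-01", "Denver"): 50.0,
    ("2024-02-01", "Denver"): 30.0,
}

# Hypothetical location hierarchy: city -> region.
CITY_TO_REGION = {"Boston": "East", "Denver": "West"}


def roll_up(atomic_cells, city_to_region):
    """Aggregate atomic (day, city) cells into (month, region) totals."""
    totals = defaultdict(float)
    for (day, city), value in atomic_cells.items():
        month = day[:7]  # "YYYY-MM-DD" -> "YYYY-MM"
        totals[(month, city_to_region[city])] += value
    return dict(totals)
```

The aggregates are derived entirely from the atomic cells, so in a model of this kind a value supplied directly at the month or region level has no place to be stored without some rule for distributing it downward.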
There has been fertile innovation in the field of updating data in multidimensional databases. One area of innovation has focused on methods to accelerate the update of granular atomic cell values and the corresponding recalculation of aggregate values. Another area of innovation has focused on allowing multidimensional databases to dynamically add dimensions. In both areas, it is assumed that data must be loaded and updated at the atomic cell level.
Although conventional multidimensional database systems can ease the task of managing data loaded and updated at the atomic cell level, it is desirable to provide a system that allows processing of update requests where the values received represent higher levels of aggregation in the dimensional hierarchy.
A common method used to process update requests where the values received represent higher levels of aggregation is to scale, or pro-rate, atomic cell values by the percentage increase or decrease of their parent value. The extension of this method to updating more than one higher-level value simultaneously is known to those skilled in the art as iterative proportional fitting. Iterative proportional fitting applies successive rounds of pro-rating until actual values converge to desired values. The disadvantages of iterative proportional fitting include the possibility that the algorithm will not converge, the algorithm's computational overhead, and the preservation of any atomic values that originally equal zero, since scaling a zero value by any factor leaves it at zero.
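The following is a minimal sketch of pro-rating and of iterative proportional fitting over a two-dimensional table, offered under stated assumptions rather than as a definitive implementation. The function names pro_rate and ipf, the stopping tolerance, and the iteration cap are hypothetical choices made for illustration.

```python
def pro_rate(cells, target_total):
    """Scale every atomic cell by the ratio of the new parent total to the old one."""
    current_total = sum(cells.values())
    if current_total == 0:
        raise ValueError("cannot pro-rate: current parent total is zero")
    factor = target_total / current_total
    return {key: value * factor for key, value in cells.items()}


def ipf(table, row_targets, col_targets, max_iter=1000, tol=1e-9):
    """Alternately pro-rate rows and columns until both sets of totals match.

    Returns (fitted_table, converged). The tolerance and iteration cap are
    illustrative assumptions.
    """
    fitted = [row[:] for row in table]  # work on a copy
    n_rows, n_cols = len(fitted), len(fitted[0])
    for _ in range(max_iter):
        # Round 1: pro-rate each row to its target total.
        for i in range(n_rows):
            row_sum = sum(fitted[i])
            if row_sum > 0:
                factor = row_targets[i] / row_sum
                fitted[i] = [value * factor for value in fitted[i]]
        # Round 2: pro-rate each column to its target total.
        for j in range(n_cols):
            col_sum = sum(fitted[i][j] for i in range(n_rows))
            if col_sum > 0:
                factor = col_targets[j] / col_sum
                for i in range(n_rows):
                    fitted[i][j] *= factor
        # Stop once both row and column totals are within tolerance.
        row_err = max(abs(sum(fitted[i]) - row_targets[i]) for i in range(n_rows))
        col_err = max(
            abs(sum(fitted[i][j] for i in range(n_rows)) - col_targets[j])
            for j in range(n_cols)
        )
        if max(row_err, col_err) < tol:
            return fitted, True
    return fitted, False
```

In this sketch, pro_rate({"a": 30.0, "b": 70.0, "c": 0.0}, 200.0) doubles the two non-zero cells and leaves cell "c" at zero. Likewise, calling ipf on the table [[0.0, 20.0], [30.0, 0.0]] with row targets [40.0, 60.0] and column targets [50.0, 50.0] does not converge, because the two zero cells can never be raised to absorb the difference between the row and column totals.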