The increase in data storage and retrieval capabilities, together with advances in online analytical processing (OLAP), has resulted in unprecedented access to information. Typically, OLAP server products are either multidimensional OLAP (MOLAP) or relational OLAP (ROLAP). Both of these structures can store multidimensional information, and each has its own well-known advantages and disadvantages.
In any database containing multidimensional data, ensuring that the data accessed is valid is a resource-intensive activity. Typically, the database management system (DBMS) either checks the validity of data when it is requested or indicates the validity of data in advance, for example through the use of a flag.
Data in a database can be stored with a timestamp indicating when that piece of data was last written. Data becomes invalid when any data on which it depends (its source data) is updated. Therefore, every time data is queried, either in itself or for use as part of a larger calculation, the DBMS may check the timestamps of all the source data and recalculate the data if necessary. A disadvantage of this method is that the number of database accesses is high, increasing the query or calculation time. An advantage of this method is that if data does not need to be recalculated, the calculation time is minimised. In today's environment, where processing speeds have far outpaced I/O speeds, this method of data validation may be inefficient, particularly if the source data changes regularly. There is also a minor increase in the database storage requirement, because timestamps must be stored with the data.
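The timestamp approach described above can be illustrated with a minimal sketch. It is not taken from any particular DBMS: the names `TimestampStore`, `write_input`, `define_sum` and `query` are hypothetical, calculated cells are assumed to be simple sums of their source cells, and a logical counter stands in for a real clock. The `reads` counter shows how every query touches all of the source data, even when nothing has changed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cell:
    value: float
    timestamp: int  # logical time of the last write to this cell
    sources: List[str] = field(default_factory=list)  # empty for input data


class TimestampStore:
    """Hypothetical store that validates calculated data at query time."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}
        self.clock = 0           # logical clock, assumed in place of wall time
        self.reads = 0           # number of cell accesses made by queries
        self.recalculations = 0  # number of times a calculated cell was rebuilt

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def write_input(self, key: str, value: float) -> None:
        # Loading data is cheap: only the written cell is touched.
        self.cells[key] = Cell(value, self._tick())

    def define_sum(self, key: str, sources: List[str]) -> None:
        # Timestamp 0 is older than any write, so the first query recalculates.
        self.cells[key] = Cell(0.0, 0, list(sources))

    def query(self, key: str) -> float:
        cell = self.cells[key]
        self.reads += 1
        if not cell.sources:
            return cell.value
        stale = False
        for source_key in cell.sources:
            self.query(source_key)  # bring the source itself up to date first
            if self.cells[source_key].timestamp > cell.timestamp:
                stale = True
        if stale:
            cell.value = sum(self.cells[s].value for s in cell.sources)
            cell.timestamp = self._tick()
            self.recalculations += 1
        return cell.value


store = TimestampStore()
store.write_input("jan", 1.0)
store.write_input("feb", 2.0)
store.define_sum("total", ["jan", "feb"])
first = store.query("total")    # recalculated, because the sources are newer
second = store.query("total")   # not recalculated, but the sources are still read
store.write_input("jan", 10.0)
third = store.query("total")    # recalculated after the source changed
```

In this sketch the second query avoids recalculation but still reads both source cells, which corresponds to the high number of database accesses noted above.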
Alternatively, whenever source data is updated, all data that is dependent on that source data may be either deleted or flagged as invalid, forcing recalculation of the dependent data if it is queried or used in a larger calculation. A disadvantage of this method is that during data load, large quantities of data must be invalidated, degrading data load performance. However, calculation and query performance is maximised, because the source data of the calculated data need not be checked before the calculated data is read.
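The flag-based alternative can be sketched in the same way. Again this is only an illustration under stated assumptions: `FlagStore`, `load`, `define_sum` and `query` are hypothetical names, calculated data is assumed to be a sum of its sources, and the dependency graph is assumed to be acyclic. The work moves from query time to load time: `load` walks every dependent and clears its validity flag, while `query` trusts the flag and reads no source data when the flag is set.

```python
from typing import Dict, List


class FlagStore:
    """Hypothetical store that invalidates dependent data at load time."""

    def __init__(self) -> None:
        self.values: Dict[str, float] = {}
        self.valid: Dict[str, bool] = {}
        self.sources: Dict[str, List[str]] = {}     # calculated key -> its sources
        self.dependents: Dict[str, List[str]] = {}  # source key -> its dependents
        self.invalidations = 0   # flags cleared during data load
        self.recalculations = 0  # calculated values rebuilt during queries

    def define_sum(self, key: str, sources: List[str]) -> None:
        self.sources[key] = list(sources)
        self.valid[key] = False  # nothing calculated yet
        for source_key in sources:
            self.dependents.setdefault(source_key, []).append(key)

    def load(self, key: str, value: float) -> None:
        self.values[key] = value
        self.valid[key] = True
        self._invalidate_dependents(key)

    def _invalidate_dependents(self, key: str) -> None:
        # Walk all direct and indirect dependents (graph assumed acyclic).
        for dependent in self.dependents.get(key, []):
            if self.valid.get(dependent, False):
                self.valid[dependent] = False
                self.invalidations += 1
            self._invalidate_dependents(dependent)

    def query(self, key: str) -> float:
        if key not in self.sources:
            return self.values[key]  # loaded (non-calculated) data
        if not self.valid[key]:
            self.values[key] = sum(self.query(s) for s in self.sources[key])
            self.valid[key] = True
            self.recalculations += 1
        return self.values[key]      # valid flag set: no source data is read


flags = FlagStore()
flags.load("jan", 1.0)
flags.load("feb", 2.0)
flags.define_sum("total", ["jan", "feb"])
before = flags.query("total")  # calculated on first use
flags.load("jan", 10.0)        # clears the flag on "total" during the load
after = flags.query("total")   # recalculated because the flag was cleared
```

With many calculated values depending on each loaded value, the recursive walk in `load` is where the data load cost described above would arise.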
Therefore, the approach selected for data validation depends on the nature of the data in the database. For constantly changing databases where calculation performance is not important, invalidation at query/calculation time may be preferred. If the time taken to load data is not important, invalidation at data load time may be preferred. This creates a problem for databases that do not fit into either of these generalisations, because each of the presently available options for cell validation carries a significant efficiency penalty.
It is an object of the present invention to overcome or alleviate problems in management of multidimensional databases at present, or at least to provide the public with a useful alternative.
Further objects of the present invention may become apparent from the following description, given by way of example only.
Definitions
Calculated Cell: A cell including at least one calculated member.
Calculated Member: A member whose value is dependent on one or more other members and/or a mathematical formula.
Cell: A location in a multidimensional database. A cell is a tuple of members.
Dimension: A set of hierarchically related members.
Input-level cell: A cell whose location contains only members that are not dependent on other members.
Member: A unique position on a dimension that includes in itself or points to data.
OLAP: On-Line Analytical Processing. A category of applications and technologies that allow the collection, storage, manipulation and investigation of multidimensional data.
OLAP Server: An application that provides OLAP functionality over a multidimensional database.
Outline: The set of all dimensions in a multidimensional database.
Source cell: A cell including at least one source member.
Source member: A member on which another member (a calculated member) is dependent.
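By way of illustration only, the relationships between members and cells defined above can be modelled as follows. The class `Member`, the helper functions and the example dimensions ("Time", "Product") are hypothetical and are not a required data format. For simplicity, a calculated member is represented here solely by the names of its source members, leaving aside members calculated from a formula alone.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Member:
    name: str
    dimension: str
    sources: Tuple[str, ...] = ()  # names of source members; empty if none

    @property
    def is_calculated(self) -> bool:
        # Assumption: a member is calculated when it depends on other members.
        return bool(self.sources)


# A cell is a tuple of members, one per dimension in the outline.
Cell = Tuple[Member, ...]


def is_calculated_cell(cell: Cell) -> bool:
    """A calculated cell includes at least one calculated member."""
    return any(member.is_calculated for member in cell)


def is_input_level_cell(cell: Cell) -> bool:
    """An input-level cell contains only members with no dependencies."""
    return not is_calculated_cell(cell)


jan = Member("Jan", "Time")
feb = Member("Feb", "Time")
mar = Member("Mar", "Time")
q1 = Member("Q1", "Time", sources=("Jan", "Feb", "Mar"))
widgets = Member("Widgets", "Product")

input_cell: Cell = (jan, widgets)      # input-level cell
calculated_cell: Cell = (q1, widgets)  # calculated cell, because Q1 is calculated
```

In this sketch, `(jan, widgets)` is also a source cell for `(q1, widgets)`, because "Jan" is a source member of "Q1".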
Throughout this specification, data in the multidimensional database has been referred to by reference to members and cells. However, this terminology is not intended to limit the scope of the invention to any particular data format in a multidimensional database.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “include”, “including”, and the like, are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense, that is to say, in the sense of “including, but not limited to”.