In the context of database systems, a “dimension” is a list of values that provide categories for data. A dimension acts as an index for identifying values of a variable. For example, if sales data has a separate sales figure for each month, then the data has a MONTH dimension. That is, the data is organized by month. A dimension is similar to a key in a relational database. Data that is organized by two or more dimensions is referred to as “multidimensional data”.
Any item of data within a multidimensional measure can be uniquely and completely selected by specifying one member from each of the measure's dimensions. For example, if a sales measure is dimensioned by MONTH, PRODUCT, and MARKET, specifying “January” for the MONTH dimension, “Stereos” for the PRODUCT dimension, and “Eastern Region” for the MARKET dimension uniquely specifies a single value of the measure. Thus, dimensions offer a concise and intuitive way of organizing and selecting data for retrieval, updating, and performing calculations.
Multidimensional data may be stored in relational database systems (“ROLAP” systems) or in specialized, “multidimensional” database systems (“MOLAP” systems). Multidimensional database systems provide structures and access techniques specifically designed for multidimensional data, and therefore provide relatively efficient storage and access to multidimensional data. However, when stored in specialized multidimensional database systems, only applications that are specially built to interact with those multidimensional database systems are able to access and manipulate the data.
On the other hand, when stored in relational database systems, all applications that support interaction with relational databases have access to the data. Such database applications communicate with the relational database system by submitting commands that conform to the database language supported by the relational database system, the most common of which is the Structured Query Language (SQL).
Relational database systems store data in the form of related tables, where each table has one or more columns and zero or more rows. The conventional mechanism for storing multidimensional data in a relational database system is to store the data in tables arranged in what is referred to as a star schema. In relational database systems, a star schema is distinguished by the presence of one or more fact tables and one or more dimension tables. Fact tables store measures, and contain foreign keys to dimension tables. Dimension tables store values for hierarchical dimensions. FIG. 1 illustrates an exemplary star schema with two dimensions.
Referring to FIG. 1, it illustrates a database 100 that includes tables 102, 104 and 106. Table 102 is named “store” and contains information about each of the stores in which sales may occur. Each row in store table 102 contains a unique store-id and information about the particular store that corresponds to the store-id. Table 104 is named “product” and contains information about each type of product that may be sold in any of the stores. Each row in product table 104 contains a unique product-id and information about the particular product.
Table 106 is named “sale” and contains information about each sale in each of the stores represented in the store table 102. Each row in sale table 106 includes a dollar amount, a store-id to indicate the store at which the sale was made, a product-id to indicate the product sold in the sale, and the date of the sale. Typically, the number of sales will be vastly greater than both the number of stores at which the sales are made and the number of products carried by the stores. Detailed information about the store and product involved in a sale transaction does not have to be stored in the rows of table 106 because such detailed information is available in tables 102 and 104, respectively. Instead, the rows of table 106 simply contain values (store-ids and product-ids) that reference information stored in the other tables 102 and 104. Therefore, tables 102, 104 and 106 constitute a star schema in which table 106 is the fact table and tables 102 and 104 are dimension tables.
The data stored in fact table 106 only has two dimensions, and therefore fact table 106 only has two columns dedicated to storing foreign key values for those dimensions. In general, a fact table must dedicate one column for storing foreign key values for each of the dimensions associated with the multidimensional data stored in the fact table. Thus, a fact table that stores data associated with twenty dimensions would have to dedicate twenty columns to the storage of foreign key values.
An alternative approach to managing multidimensional data in a relational database involves storing the data in relational files but maintaining all multidimensional structure, metadata, administration, and access control using multidimensional database system techniques. Accessing relationally-stored data using multidimensional techniques poses numerous difficulties. For example, when all administration and access to the multidimensional data are controlled exclusively through the multidimensional database system engine, two database management systems must be administered. Further, database applications that access data using conventional relational commands (e.g. SQL commands) are unable to access the multidimensional data.
The approaches described above for storing multidimensional data in relational database systems demonstrate the tradeoffs made by prior approaches, which have either (1) sacrificed the benefits of multidimensional storage to enjoy the benefits of modern relational systems, such as conventional relational access, or (2) sacrificed the benefits of relational storage to attain the efficiency of multidimensional storage.