Data is often organized into tables that are divided into rows and columns. Any given piece or set of data may be associated with one or more dimensions. In the context of database systems, a “dimension” is a list of values that provide categories for data. A dimension acts as an index for identifying values of a variable. For example, if sales data has a separate sales figure for each month, then the sales data has a TIME dimension. That is, the data is organized by time. Similarly, if separate sales values are stored for each product, then the sales data has a PRODUCT dimension.
Some of the columns of a table may correspond to dimensions, while others may represent the measures, which are quantities of interest. For example, a sales table may have a date column, a product identification column, and a location column for respectively storing values associated with the TIME, PRODUCT, and LOCATION dimensions. In addition, the sales table may include columns for storing various measures, such as the number of products sold, the price of the products, and the discounts offered.
FIG. 1A shows a Table 1A that includes one column for each of the TIME, PRODUCT, and LOCATION dimensions. Specifically, Col. 1, Col. 2, and Col. 3 of Table 1A correspond to dimension keys. In addition to the dimension columns, Table 1A includes Col. 4, Col. 5, and Col. 6, which store measures. In FIG. 1A, the Row ID column is not part of Table 1A, but has been placed in FIG. 1A to simplify explanations that appear later in this application.
Table 1A illustrates a simplified case in which there are only two products and two locations. Table 1A shows only Date IDs 1-7. There are several combinations of dimension values for which Table 1A does not have any data. For example, in Table 1A there are no entries for the Date ID, Location ID, Product ID tuples of (2,2,1), (2,2,2), and (3,2,2). Table 1A also does not have any entries having Date IDs 4 or 5. The rows of Table 1A have no particular order. The location of rows within Table 1A may be determined by space management considerations, and may differ depending on the order in which the various rows and columns have been recorded and updated, for example.
FIGS. 1B-D illustrate block diagrams of examples of dimension tables. Each dimension column of Table 1A is associated with a dimension table, giving further information associated with the various values of the dimension. Typically, there is a different table for each dimension. To improve access to the dimension tables, bitmap indexes or B-tree indexes (not shown) may be built on the columns. The dimension columns of Table 1A contain references to rows in the individual dimension tables. The individual dimension tables provide a translation between the reference or identification number used in the six-column fact table, and the names more commonly used for the reference numbers. The rows of Tables 1B-1D would not be stored in any particular order. Nonetheless, the rows of FIGS. 1B-1D are illustrated in numerical order so that it is easier to follow the discussion in the remainder of the application.
FIG. 1B shows a Table 1B, which is the dimension table for time. Table 1B has three columns for Time ID, Date, and Day of the Week. Table 1B gives a translation between the Time ID and the calendar date. Additionally, the day of the week column gives the day of the week that corresponds to the date in the Date column. Although not illustrated in this example, the Time table may have any number of other columns giving other information about each day, such as whether the day is a federal or company holiday in addition to, or instead of, any of the columns in FIG. 1B.
As can be seen from Table 1B, Date ID 1 corresponds to Jan. 1, 2003, and Date IDs 4 and 5 correspond to Saturday and Sunday, respectively. In this example, the reason there are no entries having Date IDs 1, 4, and 5, is because the business represented by Table 1A was closed on Jan. 1, 2003 and is normally closed on Saturdays and Sundays.
FIG. 1C shows Table 1C, which is an example of a Product table. Table 1C includes a Product ID column, a product name column, and a description column. The Product ID column gives the Product ID used in Table 1A for the product named in the product name column, and thereby provides a translation between the Product ID and product name. In this example, the description column provides a further description about the product.
FIG. ID shows Table 1D, which shows an example of a Location table. Table 1D includes a Location ID column, a location name column, an operation info column, and a street address column. The Location ID column gives the Location ID used in Table 1A for the location named in the location name column, and thereby provides a translation between the Location ID and location name. In this example, the Operation Info column provides information specific to the operations of the location of that row. Specifically, in Table 1D the Operation Info column indicates that Location 2 is closed on January 2, and consequently there are no entries for Location 2 on Jan. 2, 2003 in Table 1D (therefore, as mentioned above, Table 1A does not have any rows corresponding to tuples (2,2,1) or (2,2,2)). The Street Address column provides the street address of the location of the same row. Based on Tables 1B-1D and the absence of an entry in Table 1A having tuple (3,2,2), it can be deduced that there were no sales on Product 2 at Location 2 on Jan. 3, 2003.
It is desirable to access the table efficiently (quickly and/or with a minimal amount of computing). Searching for non-existent rows may add to the time required to find data, and may thereby contribute to inefficiencies.