Databases are used to store information. The data in a database may be stored in several tables, with each table storing a subset of the information in the database. Each table in the database is divided into one or more rows, with each row containing related information and/or one or more pointers to information in one or more different tables. The pointers, as well as the information to which the pointer points, are known as "keys." The other information in a row is referred to as the "attributes" of the row. Attributes are sometimes referred to as "columns" of the table.
One example of a database table is a customer table containing information about the customers of a business. The customer table might contain one row for each customer of a business, with the attributes describing the name, address, birth date and marital status of each customer.
A conventional "dimensional database" is a database made of three or more tables arranged in a particular manner. A dimensional database contains a fact table and two or more dimension tables, with each row in the fact table pointing to one or more rows in one or more dimension tables. The fact table may contain the data that is expected to change most rapidly, with the dimension tables storing data that changes less rapidly. For example, in a database that records orders for products, an order described in each row of the fact table might contain a pointer to the customer row in the customer table corresponding to the customer who placed the order. Each row in the fact table may also contain a key to a row in a product dimension table corresponding to the product in the order. Each row in the fact table may also contain a key to a row in a store dimension table corresponding to the store at which the order was placed. Additionally, each row in the fact table may contain a key to a row in a time dimension table corresponding to the date and time the order was placed. The "fact" recorded by the fact table is that an order was placed for a specific customer containing a specified product at a particular store at a stated time.
Referring now to FIG. 1A, such a dimensional database is illustrated. Fact table 100 stores fact records 102, each containing a time key, a product key, a customer key, and a store key described below. In addition, attributes containing the number of units of the product ordered are stored in the fact row 102.
A customer dimension table 120 stores one or more customer rows 122, containing information about each of the customers that may place orders for products using the fact table 100. Store dimension table 140 holds one or more store rows 142, containing information about the stores that may take orders for products. Product dimension table 130 contains one or more product rows 132, which hold information about each of the products that may be ordered. Each of the rows of the tables 120, 130, 140 contains a computer-generated integer key, or other form of key, unique within the table 120, 130, 140. To point to a row of one of these tables 120, 130, 140, the corresponding key in the fact table is set to the value of the key of the row.
Time dimension table 110 stores one or more time rows 112, each of which contains a key and an identifier of the date and time to which the index corresponds. For example, each time row 112 may contain a unique, computer-generated integer key or other form of key, and a date and time to a precision of one minute. The time dimension table 110 may be set up in advance to contain rows 112 corresponding to all possible values of time past, present, and future for a limited amount of time, such as a 10 year period.
The time key, product key, customer key, and store key in each row 102 of the fact table 100 are pointers to a row 112, 122, 132, 142 in the time dimension table 110, customer dimension table 120, product dimension table 130 or store dimension table 140, respectively.
A dimensional database may be implemented using a conventional relational database program such as the Oracle7.3 product commercially available from Oracle Corporation of Redwood Shores, Calif. or the Microsoft Access product commercially available from Microsoft Corporation of Redmond, Wash. View or Virtual databases may also be used, treating several databases as if they were a single database. Conventional relational databases with specialized tools for On-Line Analytical Processing, or OLAP-optimized databases may also be used. Such databases are sometimes referred to as MOLAP, ROLAP or DOLAP databases and are described at http://www.sentrytech.com/dw05dem.htm. Non-database implementations such as those storing data using objects, records, arrays or flat files may be used to implement dimensional databases. Keys may be implemented using conventional pointers or look-up table approaches.
A database program may generate some of the keys in fact row 102 automatically, with other keys selected by the person or program operating the database program. For example, when an order is placed, a new fact row 102 is generated in the fact table 100 by a person using, or a computer program interacting with, a database program. At this time, the time key may be selected by locating a time row in time dimension table 110 that most closely matches the system clock of the computer system on which the database program is running. The database program will write the index of this time row 112 to the time key in the fact row 102. The store key may be pre-filled in to correspond to the store row 142 in the store dimension table 140 corresponding to the store at which the order is being placed. The person or program interacting with the database program adds the customer key to the fact row by selecting a row in the customer dimension table 120 corresponding to a name or customer number provided by the customer.
The person or program then selects a product from the product dimension table 130 and signals the database program to enter the corresponding product key of the corresponding product row 132. The person or program then selects the number of units for the product designated by a product key which is stored in the fact row, thereby completing the order in the fact table 100.
Conventional dimension tables 110, 120, 130, 140 may change quite slowly in relation to the fact table 100. For example, a customer row 122 may contain fields for marital status and address. The customer table 120 may only change when a customer is added, or an existing customer changes his or her name, marital status or address, while many orders are expected each hour in the fact table 100.
Conventional dimensional databases may be implemented with a temporal fact table 100. A "temporal table" is a table that maintains historical information in addition to current information.
Unlike the fact table 100, many conventional dimensional database dimension tables 120, 130, 140 are not temporal tables. If a customer changes his or her address or gets married or divorced, the old information in the customer dimension table is replaced with current information. Because the dimension tables 120, 130, 140 are not temporal, changes to the data in these tables 120, 130, 140 can produce inaccurate results when the database is later queried. For example, if a customer changes his or her marital status, the database program may simply update the customer row 122 for that customer in the customer dimension table 120, deleting the old value of the marital status attribute and replacing it with the new value. Because the customer's marital status is not preserved over time, queries to the dimensional database 100 that depend on time of a change or an effective date of a customer's marital status will be inaccurate. For example, if a customer places an order while the customer is single, and then changes his or her marital status to "married", querying the fact table 100 for orders placed by single customers will omit the fact row 102 corresponding to the order placed when the customer was single.
Several solutions have been attempted to record changes in data related to the dimension tables 120, 130, 140 without causing those tables to become excessively large in size. For example, another row in the customer dimension table 120 may be added each time a change is made to the information for customer. Subsequent orders made by the customer point to the new customer row 122. However, such solutions have not recorded the time of the change. Because only the rows of the fact table 100 record any indication of when a change is made, when a query is performed on the database, the first order to be placed by that customer after the change was made may erroneously be identified as the best indication of the time of that change. Such inaccuracy prevents accurate queries from being made to the database, such as a list of all orders placed within one month of a change of a marital status.
Another attempted solution is to add storage for an old value of an attribute for some or all of the attributes stored by each row 122, 132, 142 in the dimension tables 120, 130, 140. Such an arrangement allows an attribute to be updated, while preserving the old value of the attribute. However, this attempted solution does not store more than one older value of each attribute, does not store the time of the change and may require re-programming of the applications accessing the data in the database to implement. Such an attempted solution will not allow queries to be made to the database that depend on the time of the change or effective date of a value of an attribute.
A system and method are therefore necessary that can accurately select records from a dimensional database that depend on a time of a change or effective date of a value of an attribute in a table of a dimensional database.