This invention relates to computer databases, and in particular to minimizing the changes a database requires when it is updated, keeping database information consistent, and preventing redundancies.
Primary keys have a dual role in prior art databases. All prior art primary keys defined for database reference tables are used both to uniquely identify each instance of reference data (so that it can be accessed by a computer processor) and as the basis for linking the data contained in multiple database tables. This dual role for the primary key is a major problem. With prior art primary keys there is not enough isolation between the unique identification of reference data and the reference data's association with non-reference data. Therefore the reference data cannot be changed independently without impacting the non-reference data as well.
With these prior art dual-purpose primary keys it is difficult to remove redundant reference data, to merge reference data records, or to integrate data from multiple databases.
Prior art database maintenance methods were generally performed in an ad-hoc manner. That is, they were not designed in advance. The database manager would simply do what he felt was appropriate to complete a task, with no regard for any tasks that might be required in the future or as a result of any errors made in performing the original task.
For definition purposes in this application, and as would be known to those skilled in the art, a database table is a predefined data structure comprised of multiple predefined data fields or table columns. Each predefined data column has attributes associated with it, such as the data type (character string, number, date, etc.), the data field length, the optionality, and more. Each row of the database table is a data record composed of the data field values. A database table may contain just a few data records or many thousands and sometimes millions of data records. As the number of data records increases, it becomes important to be able to access or find specific data records or groups of data records as quickly as possible. For this purpose we define “keys”. A database table may contain many “keys” and each “key” is comprised of one or more data fields. There are several types of “keys” that have various purposes. Almost every database table has a primary key declared, which is used to uniquely identify each and every data record. In addition, alternate keys may be declared to aid in the access of data records. These alternate keys are often used to uniquely identify data records as well. The primary key, however, has the additional distinction that other database tables often inherit it. When a database table inherits a primary key, that table receives a copy of the inherited primary key's data fields as part of its predefined data structure. This inherited primary key is called a foreign key in the table or tables that inherit it. The foreign key may then be unique or non-unique within a table.
From the database table point-of-view, the keys are predefined data structures. From the data value point-of-view, the keys are values that occupy these predefined data structures. It is the values that actually allow us to link data records. From the point-of-view of the data structure we refer to key columns or key attributes. From the point-of-view of the data values we refer to the key values.
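By way of a non-limiting illustration of the definitions above, the following sketch uses Python's standard sqlite3 module to declare a primary key on a reference table and a foreign key in a related table that inherits it. The table names, column names, and sample values (customer, purchase, "Acme Corp", and so on) are hypothetical and are not taken from this application.

```python
import sqlite3

# In-memory database; all names and values below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,  -- primary key: identifies each reference record
        name        TEXT NOT NULL
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customer(customer_id),  -- foreign key: inherited primary key
        item        TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme Corp'), (2, 'Bolt Ltd');
    INSERT INTO purchase VALUES (10, 1, 'widget'), (11, 1, 'gear'), (12, 2, 'nut');
""")

def purchases_for(customer_id):
    """Link related records to a reference record through matching key values."""
    return conn.execute(
        "SELECT c.name, p.item "
        "FROM purchase p JOIN customer c ON c.customer_id = p.customer_id "
        "WHERE c.customer_id = ? ORDER BY p.purchase_id",
        (customer_id,),
    ).fetchall()
```

In this sketch the key columns are the predefined structure, while the key values (1 and 2) are what actually link a purchase row to its customer row. The foreign key customer_id is non-unique within the purchase table.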
The present invention overcomes the prior art primary key data-isolation deficiency, which impairs one's ability to maintain one's data and thereby to improve the quality of the data. The present invention provides “paired keys” to overcome, and preferably eliminate, this data-isolation deficiency. Methods are provided which allow data to be restored to its original state. In other words, if a data record is modified, the method allows a processor to undo the modification and return the data record to its original state.
The present invention provides for the following new data stewardship methods as a result of using paired keys to identify reference data:
Add/Remove Paired Keys
Transform/Interpret Paired Keys
Declare Duplicate/Declare Unique Data Records
Merge/Split Data Records
Populate/Destroy Paired Keys
Isolate/Integrate Data Records
Activate/Inactivate a Data Record
The present invention in one embodiment provides a plurality of data records, each data record having paired keys comprised of a first key and a second key. The first key consists of the declared primary key attributes of the data records. The purpose of the primary key in some embodiments is to uniquely identify links from the data records in the reference database table to data records in related database tables, to other data records in this reference database table, or to data records in other reference database tables. The second key is comprised of one or more data fields that are used to locate the reference data record that should be associated with the link. Since the first key is the existing primary key of the reference table, it inherently links the original reference data record to the non-reference data.
The reference data pointed to by the second key may exist in the same table, in a different table (via a non-identifying relationship), or even in a different table in a different database. In some embodiments of the present invention the reference data is totally isolated from the non-reference data. This isolation means that modifications performed on reference data do not require changes to the non-reference data.
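The following is a minimal sketch, in Python, of how paired keys might isolate reference data from related data. It is offered only as an illustration under stated assumptions and not as the claimed implementation: the names ReferenceRecord, resolve, declare_duplicate, and declare_unique are hypothetical, the second key is assumed to hold the first key value of the record to be used, each record is assumed to point initially to itself, and resolution follows a single hop only.

```python
from dataclasses import dataclass

@dataclass
class ReferenceRecord:
    first_key: int   # primary key; related records link through this value
    second_key: int  # locates the reference record whose data should be used
    name: str

# Hypothetical reference table. Each record initially points to itself (assumption).
reference = {
    1: ReferenceRecord(1, 1, "Acme Corp"),
    2: ReferenceRecord(2, 2, "ACME Corporation"),  # redundant entry for the same customer
}

# Related (non-reference) records hold only first key values.
purchases = [
    {"purchase_id": 10, "customer_id": 1},
    {"purchase_id": 11, "customer_id": 2},
]

def resolve(first_key):
    """Follow the second key (one hop) to the reference record to be used."""
    return reference[reference[first_key].second_key]

def declare_duplicate(duplicate_key, surviving_key):
    """Redirect the duplicate to the surviving record; return the old value for undo."""
    previous = reference[duplicate_key].second_key
    reference[duplicate_key].second_key = surviving_key
    return previous

def declare_unique(duplicate_key, previous_second_key):
    """Undo declare_duplicate by restoring the earlier second key value."""
    reference[duplicate_key].second_key = previous_second_key
```

In this sketch, calling declare_duplicate(2, 1) causes purchase 11 to resolve to the "Acme Corp" record without any change to the purchases list, and passing the returned value to declare_unique restores the original state.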
The present invention in one embodiment provides an apparatus comprising a processor and a computer memory. A first table of reference data records is stored in the computer memory, each reference data record comprised of a field of a first type, a field of a second type, and a field of a third type. In addition, a first table of related data records is stored in memory, each related data record comprised of a field of a fourth type and a field of a fifth type. The fields of the first, second, third, fourth, and fifth type contain first, second, third, fourth, and fifth types of data, respectively. The first type of data in the fields of the first type is used by the processor to access the reference data records, and this first type of data may be called a primary key. The fourth type of data of the related data records may be the same as the first type of data of the reference data records. Each instance of the first type of data in each field of the first type is used by the processor to access related data records which are linked to the particular reference data record, and each such instance may be called a primary key value (the “primary key” being a collection of primary key values).
The present invention may include in one embodiment a plurality of further tables of a plurality of further related data records stored in memory, each related data record comprised of a field of a fourth type and a field of a sixth type. The fields of sixth type may contain a sixth type of data. The fourth type of data of the plurality of further related data records may be the same as the first type of data of the reference data records. Each instance of the first type of data in each field of the first type may be used by the processor to access related data records which are linked to the particular reference data record.
The first type of data and the fourth type of data may be a customer identification number. An additional type of data may be provided to provide status information regarding particular reference data records, specify whether the particular reference data record is active or inactive, and/or specify whether a particular reference data record is a duplicate of another reference data record.
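As a non-limiting illustration of the status information described above, the following Python sketch keeps a status value for each reference data record. The Status values, the reference_status mapping, and the function names are hypothetical and are chosen only for this example.

```python
from enum import Enum

class Status(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    DUPLICATE = "duplicate"

# Hypothetical status field, keyed by customer identification number.
reference_status = {1: Status.ACTIVE, 2: Status.DUPLICATE, 3: Status.ACTIVE}

def inactivate(customer_id):
    """Mark a reference record inactive without deleting it."""
    reference_status[customer_id] = Status.INACTIVE

def activate(customer_id):
    """Reverse inactivate by marking the record active again."""
    reference_status[customer_id] = Status.ACTIVE

def active_ids():
    """Return the identifiers of records that are currently active."""
    return sorted(k for k, s in reference_status.items() if s is Status.ACTIVE)
```

Because the record is flagged rather than removed, related data records that carry the same customer identification number remain untouched.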
In one embodiment of the present invention a method is provided where fields of a third type and of a fourth type are added to each of a plurality of reference data records. The fields of a third type may be used as an alternate key to access each reference data record and the fields of a fourth type may be used as a second key to access a particular reference data record.
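One possible way to add such fields to an existing reference table is sketched below using Python's standard sqlite3 module. The table name, the column names alternate_key and second_key, and the choice to populate both fields initially from the existing primary key value are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical existing reference table with only a primary key.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO customer VALUES (1, 'Acme Corp'), (2, 'Bolt Ltd');
""")

def add_paired_key_fields(conn):
    """Add an alternate key field and a second key field to each reference record."""
    conn.execute("ALTER TABLE customer ADD COLUMN alternate_key INTEGER")
    conn.execute("ALTER TABLE customer ADD COLUMN second_key INTEGER")
    # Assumption: each record initially locates itself through its own key value.
    conn.execute(
        "UPDATE customer SET alternate_key = customer_id, second_key = customer_id"
    )
    # The alternate key uniquely identifies records, so index it as unique.
    conn.execute(
        "CREATE UNIQUE INDEX customer_alternate_key ON customer(alternate_key)"
    )

add_paired_key_fields(conn)
```

The existing primary key column and any related tables that inherit it are left as they were; only the reference table gains the two new fields.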
The present invention in some embodiments improves and maintains the quality of the data contained in the database. The present invention provides a new data architecture for making data consistent across databases without the need to integrate data from multiple databases into a single database.
The paired keys in accordance with the present invention may be added to any existing database reference table. This augments the functionality of the table and totally isolates the reference data from its related data.