1. Technical Field
This invention relates to relational data base management systems, and more particularly to maintaining the logical consistency of data in relational data bases by means of referential integrity.
2. Description of the Prior Art
A data base management system is a computer system for recording and maintaining data. In a relational data base management system, data is stored in "tables" which can be viewed as having horizontal rows and vertical columns. Relational data bases were introduced by E. F. Codd in "A Relational Model of Data for Large Shared Data Banks", CACM, Vol. 13, No. 6 (June 1970), and expanded by E. F. Codd in "Extending the Database Relational Model to Capture More Meaning", ACM TODS, Vol. 4, No. 4 (Dec. 1979). The Database 2 product of the International Business Machines Corporation (IBM) is an example of a typical relational data base management system.
Within relational data bases, an important function is that of "referential integrity". Referential integrity ensures the consistency of data values between related columns of two different tables or of the same table. Required relationships between columns of tables are known as "referential constraints". A row in a "dependent table" possesses referential integrity with respect to a constraint if the value of its "foreign key" exists as the value of a "primary key" in some row of a "parent table", or if the value of its foreign key is null. In other words, every row in the dependent table whose foreign key is not null must have a corresponding parent row in the parent table. If a dependent row's non-null foreign key has no matching primary key value in the parent table, then that referential constraint is violated and there is a loss of referential integrity in the data base comprising those tables. To enforce referential constraints and thereby maintain the data base's referential integrity, the system must ensure that non-null foreign key values always have corresponding primary key values. In implementations of referential integrity, the system also ensures that primary key values are unique, a property known as "entity integrity". Referential integrity was explained by C. J. Date in "An Introduction to Database Systems", 3rd Edition, Addison-Wesley Publishing Co. (1981).
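The definitions above can be illustrated with a minimal Python sketch. The table contents, the column names (`dept_id`, `emp_id`), and the helper functions `find_violations` and `has_entity_integrity` are hypothetical and chosen only for illustration; tables are modeled as lists of dictionaries rather than as any particular system's storage.

```python
# Hypothetical example data: the parent table holds departments and the
# dependent table holds employees that refer to them.
parent_rows = [{"dept_id": 10}, {"dept_id": 20}]
dependent_rows = [
    {"emp_id": 1, "dept_id": 10},    # foreign key matches a parent row
    {"emp_id": 2, "dept_id": None},  # null foreign key: satisfies the constraint
    {"emp_id": 3, "dept_id": 99},    # no matching primary key: violation
]


def find_violations(parents, dependents, primary_key, foreign_key):
    """Return dependent rows whose foreign key is non-null and unmatched."""
    primary_values = {row[primary_key] for row in parents}
    return [
        row
        for row in dependents
        if row[foreign_key] is not None and row[foreign_key] not in primary_values
    ]


def has_entity_integrity(parents, primary_key):
    """Return True if every primary key value in the parent table is unique."""
    values = [row[primary_key] for row in parents]
    return len(values) == len(set(values))


violations = find_violations(parent_rows, dependent_rows, "dept_id", "dept_id")
print(violations)                                    # [{'emp_id': 3, 'dept_id': 99}]
print(has_entity_integrity(parent_rows, "dept_id"))  # True
```

Only the row with `emp_id` 3 is reported, since the row with a null foreign key is treated as satisfying the constraint.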
In some implementations of referential integrity a "primary key index" (or "primary index") on a parent table's primary key is used to quickly locate primary key values in a table. Indexes are commonly used in data processing systems to quickly locate data rows. An index provides an ordered sequence for the rows of a table, the order being based on the values of an "index key". Primary indexes enforce the uniqueness of primary key values in the parent table by requiring that each value be unique in the index. Similarly, "foreign key indexes" on the foreign keys of dependent tables help to quickly locate particular foreign key values.
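As a rough illustration of how a primary index can both locate key values quickly and enforce their uniqueness, the following Python sketch keeps the key values in sorted order and uses binary search. The class name `UniqueIndex` and its methods are hypothetical; a sorted list stands in for whatever index structure a real system would use.

```python
import bisect


class UniqueIndex:
    """Ordered index that maps each unique key value to a row identifier."""

    def __init__(self):
        self._keys = []     # key values, kept in sorted order
        self._row_ids = []  # row identifier for the key at the same position

    def insert(self, key, row_id):
        """Add a key, rejecting it if the value is already in the index."""
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            raise ValueError(f"duplicate primary key value: {key!r}")
        self._keys.insert(pos, key)
        self._row_ids.insert(pos, row_id)

    def lookup(self, key):
        """Return the row identifier for key, or None if the key is absent."""
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._row_ids[pos]
        return None


index = UniqueIndex()
index.insert(20, row_id=0)
index.insert(10, row_id=1)
print(index.lookup(10))  # 1
print(index.lookup(99))  # None
```

A second call such as `index.insert(10, row_id=2)` raises `ValueError`, which is the sketch's analogue of a primary index rejecting a duplicate primary key value.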
Referential constraints must be enforced whenever the data of the data base is manipulated so as to affect primary or foreign keys. In relational data base management systems which use the Structured Query Language (SQL), data is primarily manipulated by the LOAD, INSERT, UPDATE, and DELETE commands and their resulting operations. The LOAD and INSERT commands both add data to the data base, with LOAD typically adding many rows and INSERT adding only a few. UPDATE changes the contents of one or more rows of data, and DELETE deletes one or more rows. Whenever one of these operations occurs, the referential constraints involving the affected rows must be enforced to ensure the data base's referential integrity. The prior art contains a number of different methods for implementing referential integrity and enforcing referential constraints.
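The effect of enforcement on insert, update, and delete operations can be observed with a small sketch that uses SQLite through Python's standard `sqlite3` module. SQLite is used here only as a convenient stand-in and is not the system discussed in this document; it has no LOAD command, so that operation is not shown. The table names `dept` and `emp` and the helper `is_rejected` are hypothetical, and the sketch assumes an SQLite build with foreign key support compiled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE emp ("
    "emp_id INTEGER PRIMARY KEY, "
    "dept_id INTEGER REFERENCES dept(dept_id))"
)
conn.execute("INSERT INTO dept VALUES (10)")
conn.execute("INSERT INTO emp VALUES (1, 10)")


def is_rejected(statement):
    """Run a statement and report whether a constraint violation stopped it."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    # Insert a dependent row whose foreign key has no parent row.
    "insert": is_rejected("INSERT INTO emp VALUES (2, 99)"),
    # Update a foreign key to a value with no parent row.
    "update": is_rejected("UPDATE emp SET dept_id = 99 WHERE emp_id = 1"),
    # Delete a parent row that still has a dependent row.
    "delete": is_rejected("DELETE FROM dept WHERE dept_id = 10"),
    # A null foreign key satisfies the constraint.
    "null_fk": is_rejected("INSERT INTO emp VALUES (3, NULL)"),
}
print(results)
```

Under these assumptions the first three statements are rejected and the insert with a null foreign key is accepted.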
One prior art method for enforcing referential constraints first checks for constraint violations which would be caused by a pending manipulation (e.g., a load, insert, update, or delete operation), and then performs the manipulation only if no constraint would be violated. These checking and manipulating steps are executed on a transaction-by-transaction or statement-by-statement basis as each transaction or statement is presented for execution. A single statement typically manipulates one or a few rows. A transaction typically consists of several statements, and therefore manipulates even more rows. In this prior art method, because the enforcement and manipulation phases are performed separately on the data for each statement or transaction, two passes through the data are required. Each pass through the data requires that each affected row be accessed and read, a time-consuming operation in most data processing systems. Performing two passes is therefore inefficient and slows the system's performance.
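The check-then-manipulate method described above can be sketched in Python as follows. The function `check_then_apply`, its parameters, and the row-read counter are hypothetical and serve only to make the two passes visible; this is an illustration of the general approach, not a reconstruction of any particular prior art system.

```python
def check_then_apply(parent_keys, dependent_table, new_rows, foreign_key):
    """Validate every pending row, then insert the rows if all are valid.

    Returns (applied, row_reads), where row_reads counts how many times a
    pending row was accessed across both passes.
    """
    row_reads = 0

    # Pass 1 (enforcement): read each pending row and check its foreign key.
    for row in new_rows:
        row_reads += 1
        value = row[foreign_key]
        if value is not None and value not in parent_keys:
            return False, row_reads  # reject the statement; nothing is inserted

    # Pass 2 (manipulation): read each pending row again and insert it.
    for row in new_rows:
        row_reads += 1
        dependent_table.append(row)

    return True, row_reads


parent_keys = {10, 20}
emp_table = []
accepted = check_then_apply(
    parent_keys, emp_table, [{"dept_id": 10}, {"dept_id": None}], "dept_id"
)
rejected = check_then_apply(
    parent_keys, emp_table, [{"dept_id": 10}, {"dept_id": 99}], "dept_id"
)
print(accepted)        # (True, 4)
print(rejected)        # (False, 2)
print(len(emp_table))  # 2
```

In the accepted case each of the two pending rows is read twice, once per pass, which is the overhead the paragraph above describes.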
Another prior art method reverses the order of the manipulation and enforcement phases described above, but still requires two passes through the data. This method defers constraint enforcement until after the data has been manipulated, and undoes all of the manipulations if any constraint has been violated. Again, because the manipulation and enforcement phases are performed separately over several rows, processing time is increased and the system's performance suffers accordingly. There is thus a need for a method of enforcing referential constraints which uses only a single pass through the data.
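The deferred-enforcement method described above can be sketched in the same style. The function `apply_then_check` and its parameters are hypothetical, and undoing is modeled simply by truncating an in-memory list; a real system would use its own recovery mechanism.

```python
def apply_then_check(parent_keys, dependent_table, new_rows, foreign_key):
    """Insert the pending rows first, then check them and undo on violation.

    Returns True if the rows were kept and False if they were undone.
    """
    original_length = len(dependent_table)

    # Pass 1 (manipulation): insert every pending row without checking it.
    dependent_table.extend(new_rows)

    # Pass 2 (enforcement): re-read the rows that were just inserted.
    for row in dependent_table[original_length:]:
        value = row[foreign_key]
        if value is not None and value not in parent_keys:
            # Undo all of the manipulations made by this statement.
            del dependent_table[original_length:]
            return False

    return True


parent_keys = {10, 20}
emp_rows = [{"dept_id": 20}]
kept = apply_then_check(parent_keys, emp_rows, [{"dept_id": 10}], "dept_id")
undone = apply_then_check(
    parent_keys, emp_rows, [{"dept_id": 10}, {"dept_id": 99}], "dept_id"
)
print(kept, undone)   # True False
print(len(emp_rows))  # 2
```

The second call inserts two rows, finds the violation on the second pass, and removes both, so the table is left as it was before that call.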
A third prior art version of referential integrity incorporates paths or "links" representing constraints between a parent and its dependent records into the basic access path to the parent data. This method of "linked" referential constraints is typically implemented by using a chained list to go from a parent to all its dependents, or by using a B-tree rooted in the parent to point to all dependents. These linked methods suffer from several disadvantages. One is that the enforcement of such linked referential constraints requires special provisions for detecting and resolving self-referencing and cyclic constraints. Another is that constraints cannot be added to or deleted from existing data without modifying the links themselves, which typically requires restructuring the data. There is therefore a need for an efficient method of enforcing referential constraints which allows ready modification of the constraints without restructuring the data.
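The chained-list form of linked constraints described above can be sketched as follows. The record classes `ParentRecord` and `DependentRecord`, the helper functions, and the sample names are hypothetical; the sketch shows only the general idea of storing the constraint as pointers inside the records themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DependentRecord:
    name: str
    # The link to the next dependent of the same parent is stored in the record.
    next_dependent: Optional["DependentRecord"] = None


@dataclass
class ParentRecord:
    key: int
    # Head of the chain of dependent records for this parent.
    first_dependent: Optional[DependentRecord] = None


def attach(parent: ParentRecord, dependent: DependentRecord) -> None:
    """Link a dependent record at the head of its parent's chain."""
    dependent.next_dependent = parent.first_dependent
    parent.first_dependent = dependent


def dependents_of(parent: ParentRecord) -> List[str]:
    """Follow the chain from the parent and collect its dependents' names."""
    names = []
    node = parent.first_dependent
    while node is not None:
        names.append(node.name)
        node = node.next_dependent
    return names


dept = ParentRecord(key=10)
attach(dept, DependentRecord("alice"))
attach(dept, DependentRecord("bob"))
print(dependents_of(dept))  # ['bob', 'alice']
```

Because the link fields are part of each record in this sketch, adding or dropping a constraint would mean adding or removing fields in every stored record, which is one way to picture the restructuring cost noted above.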