1. Technical Field
The present invention relates in general to improvements in data processing systems and in particular to improvements in relational databases within a data processing system. Still more particularly, the present invention relates to methods for maintaining referential integrity within a relational database despite the creation or alteration of a data table within that database.
2. Background Art
One clear advantage in the utilization of data processing systems is the systematic manipulation and utilization of large quantities of data. Typically, these large quantities are arranged in an orderly manner in a so-called database which may be rapidly accessed either directly or indirectly by any processor within the data processing system.
More recently, the utilization of database management systems has reached a new level of sophistication with the advent of the relational database. A relational database is a database composed of multiple data tables. The relational model which forms the basis for such a database includes three major subsections. A structural subsection which addresses the structure a relational database must conform to, a data manipulation subsection which governs the manner in which data must be manipulated within a relational database and a referential integrity subsection which controls the manner in which changes in one area of the database affect the remainder of the database.
Referential integrity may be achieved by adhering to selected constraints which, while difficult to implement, are well known in the literature. For example, the major referential constraint is the assertion that non-null values of a designated "foreign key" are valid only if they also appear as values of the "primary key" of a designated data table. In the terminology of relational databases, the "parent" data table is the data table in a relationship of two data tables which contains the designated primary key. Similarly, the "dependent" data table is the data table in the aforementioned relationship which contains the foreign key.
It should be clear upon reflection that a single data table may be both a parent table and a dependent table in an arbitrary number of relationships. A data table may be said to be a "descendant" of a parent data table if it is a dependent of that data table or a descendant of a dependent of the parent data table. Conversely, a data table is said to be an "ancestor" of a second data table if it is a parent of that table or a parent of a table which includes the second table as a descendant. Also, a data table may be a descendant and an ancestor of itself if it is interconnected in a so-called "cycle" with multiple other data tables. A data table may also be both a parent data table and a dependent data table for a single constraint in a so-called "self-referencing" table.
In order to control the referential integrity of a relational database it is necessary to associate a relationship rule with each referential constraint. The three activities which must be addressed in such relationships are the results of an update, an insertion or a deletion operation. In view of the relational nature of the data tables in such a database it should be obvious that any one of these three activities in one data table may have an effect on another data table within the database. Thus, a series of rules is generally associated with each such activity.
The rule governing insertions is quite simple. A primary key may be inserted; however, a foreign key may only be inserted if it meets the basic referential constraint. That is, a non-null value of a foreign key may only be inserted if it also appears as a value of the primary key. Similarly, updates to a foreign key are only possible if the new foreign key also appears as a valid primary key. In the depicted embodiment of the present method, updates to a primary key are "Restricted," that is, they are only permitted where that primary key does not have any dependants. Of course, a system could be implemented such that updates to a primary key will automatically be implemented in each dependent key in a so-called "Cascade" mode of operation, or will cause each dependent key to be "Set Null."
The delete operation in the depicted embodiment of the present invention will utilize all three of the aforementioned relationship rules. Therefore, it is possible to have a delete rule of "Restrict," "Cascade" or "Set Null." In this manner, the effect of a deletion anywhere in the database is clearly defined in terms of its effect on dependent data tables. In the terminology of relational databases, any data table which is involved in a delete operation with a second data table is said to be "delete connected" to the second data table.
Two additional restrictions are generally applied to the relationships among a set of data tables with respect to delete connections. Firstly, a table may not be delete connected to itself in a cycle of two or more tables. Secondly, if a data table is delete connected to another data table through multiple paths, the relationship rules governing the delete operations must be identical and must not be Set Null.
Thus, any attempted modification to the structure of a relational database must carefully traverse each of these relationship rules to maintain referential integrity. The large numbers of related data tables contained within a relational database and the multiple relationships which may exist between those tables make the verification of the relationships after an attempted modification to the structure of the database quite difficult. It should therefore be obvious that a need exists for a method which may be utilized to efficiently and rapidly ascertain whether or not an attempted modification to the structure of a relational database will allow consistent referential integrity.