1. Field of the Invention
The present invention relates to the field of computer programming, and more particularly to a method, system, and computer readable code for increasing the efficiency of operations that are to be performed on a relational database by programmatically ordering the operations according to the referential integrity constraints defined for the affected tables.
2. Description of the Related Art
xe2x80x9cReferential integrityxe2x80x9d refers to a property of relational databases wherein the consistency among tables of a database is maintained. In particular, maintaining referential integrity requires that all foreign key values in the rows of each table are valid. A foreign key is valid if its value either (1) is null, or (2) appears as a primary key in some row of an identified table. The table in which the primary key is contained is often called the xe2x80x9cparentxe2x80x9d table, and the table in which the foreign key is contained is called the xe2x80x9cdependentxe2x80x9d table. A particular table may have no foreign keys, in which case there are no referential integrity constraints to be enforced for this table, or the table may have one or more foreign keys. When a table has multiple foreign keys, then there are multiple parent/dependent table relationships and a referential integrity constraint must be enforced for each such relationship.
Referential integrity constraints must be enforced when operations are performed on the database to insert or delete rows of tables and when update operations affect foreign key values. The insert constraint requires that each foreign key value in a row being inserted must already appear as a primary key value in the corresponding parent table. For example, suppose a database has an Employee table and a Department table, and that each row of the Employee table has a column entry identifying the department in which that employee works. The department column may be defined as a foreign key in the Employee table, where the corresponding primary key is a department number in the Department table. In this example, maintaining referential integrity requires that a department is created in the Department table before employees can be defined for that department in the Employee table. The delete constraint requires that the primary key of a row to be deleted cannot be specified as the foreign key value of one or more rows in any dependent tables. Using the same Employee and Department tables as an example, if a department is to be deleted from the Department table, maintaining referential integrity requires that all Employee rows referring to the department number of this department are deleted first (or that the delete operation is prevented). The update constraint requires that an update to a foreign key value must use a new value that already appears as a primary key in the parent table (or that the update sets the foreign key value to null).
Thus, it can be seen that the order in which operations are performed against a relational database is critical to maintaining the referential integrity of the database. Many database systems are commercially available which provide detection of potential referential integrity constraint violations, and prevent the operations which would lead to an inconsistent database from being performed. (An invalid insert operation, for example, may generate an error message indicating that it contains a foreign key value which has not yet been used as a primary key value.) The DATABASE 2# product, commonly referred to as DB/2(copyright), from the International Business Machines Corporation (IBM) is an early example of this type of database system. (xe2x80x9cDATABASE 2xe2x80x9d is a trademark, and xe2x80x9cDB/2xe2x80x9d is a registered trademark, of IBM.)
The order in which operations must be performed to avoid referential integrity constraint violations typically does not match the order in which relational rows are created or deleted within application programs. Therefore, the order in which the rows need to be inserted or deleted from the database cannot be deduced from the application""s event flow. Furthermore, when a user is providing information for creating or deleting rows interactively, requiring the user to specify the information in the proper order for maintaining referential integrity places an unreasonable burden on the user of understanding the (possibly complex) table interrelationships and requires him to work in an inflexible, non-intuitive manner.
U.S. Pat. No. 6,456,995 (Ser. No. 09/224,427, filed Dec. 31, 1998), which is titled xe2x80x9cSystems, Methods and Computer Program Products for Ordering Objects Corresponding to Database Operations that are Performed on a Relational Database Upon Completion of a Transaction by an Object-Oriented Transaction Systemxe2x80x9d and which is assigned to the same assignee as the present invention, defines a technique for programmatically ordering the operations to be performed on relational databases in order to avoid referential integrity constraint violations. The technique defined in this referenced invention orders operations at the object level, for object-oriented programming applications, using associations between objects.
It would be advantageous to have a technique for programmatically ordering operations that can be used with applications which are not object-oriented, and/or which can be used outside the scope of the application generating the changes (such as at a relational database engine, upon receiving non-ordered operations from an application), thereby avoiding change to the application itself. Without automatic database operation ordering, programmers need to either manually code the operation ordering into the application program which generates the operations, or disable the referential integrity rule enforcement in the database. Manually coding the operation ordering is time-consuming and error prone, and easily leads to unmaintainable code. Disabling the referential integrity rule checking may lead to an inconsistent database.
Accordingly, what is need is an improved technique for automatically ordering operations to be performed on a relational database.
An object of the present invention is to provide a technique for programmatically ordering operations that are to be performed on a relational database in order to avoid referential integrity constraint violations.
A further object of the present invention is to provide this ordering technique such that the efficiency of relational database operations is increased.
Another object of the present invention is to provide this programmatic ordering technique where the ordering uses the referential integrity constraints defined for the affected tables.
Still another object of the present invention is to provide a programmatic ordering technique that enables batch mode processing, whereby multiple relational database store operations can be performed at a time, thereby reducing the number of round-trips between the application and the database engine and thus increasing the efficiency of the overall system.
Yet another object of the present invention is to provide a programmatic ordering technique that is not limited to use with object-oriented applications.
Another object of the present invention is to provide a programmatic ordering technique that can be used outside the scope of the application program which generates the database modifying operations, thereby enabling the application itself to remain unchanged.
Other objects and advantages of the present invention will be set forth in part in the description and in the drawings which follow and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides a method, system, and computer program product for increasing the efficiency of operations to be performed against a relational database while avoiding referential integrity constraint violations. This is achieved by using a technique which programmatically orders the operations to be performed. The technique comprises: determining an insert order among a plurality of tables of the database; determining a delete order among this plurality of tables; clustering a plurality of modifications to be made to the database according to an operation to be performed by each of the modifications and according to a particular one of the tables which is to be affected by each of the modifications; and applying the clustered modifications to the database. Applying the clustered modifications further comprises: first applying all of the clustered modifications for which the operation is insert, according to the insert order; next applying all of the clustered modifications for which the operation is update; and last applying all of the clustered modifications for which the operation is delete, according to the delete order. The clustered modifications for which the operation is update may be applied in an arbitrary order.
Determination of the insert order and of the delete order may be performed at a time prior to the clustering and applying of modifications.
This technique may operate at a relational database engine. It may also operate separately from an application program which generates the modifications.
The clustering and applying of modifications may be performed repeatedly for each of a subsequent plurality of modifications. Application of the modifications may be performed in a batch-write mode.
Determining the insert order may further comprise: identifying, for each selected one of the tables, one or more related tables; determining, among the selected table and each of the related tables, an insert precedence; and ordering the selected table before the related tables for which the selected table has insert precedence, and ordering the selected table after the related tables for which the selected table does not have insert precedence.
Determining the insert precedence may further comprise: determining, for the selected table and each particular one of the related tables, whether a relationship between the selected table and the particular related table has a constraint; determining, when the constraint exists, whether a foreign key of the constraint is located in the particular related table; concluding that the selected table does not have insert precedence when the constraint does not exist or when the foreign key is not located in the particular related table; and concluding that the selected table does have insert precedence when the foreign key is located in the particular related table.
Determining the delete order may further comprise: identifying, for each selected one of the tables, one or more related tables; determining, among the selected table and each of the related tables, a delete precedence; and ordering the selected table before the related tables for which the selected table has delete precedence, and ordering the selected table after the related tables for which the selected table does not have delete precedence.
Determining the delete precedence may further comprise: determining, for the selected table and each particular one of the related tables, whether a relationship between the selected table and the particular related table has a constraint; determining, when the constraint exists, whether a foreign key of the constraint is located in the selected table; concluding that the selected table does not have delete precedence when the constraint does not exist or when the foreign key is not located in the selected table; and concluding that the selected table does have delete precedence when the foreign key is located in the selected table.
The present invention will now be described with reference to the following drawings, in which like reference numbers denote the same element throughout.