The present invention relates to database environments, and more particularly to method and system for handling foreign key database updates in such environments.
Today""s object-oriented database environments are typically used as front-ends to more simplistic but efficient data models, such as flat files and relational databases. In such an environment, a client submits a semantic-rich query at the object-level and the system generates an execution plan specifying what queries to run against the relational databases, and how to combine those results to compute the final result.
In a relational database, data is perceived by its users as a collection of tables. The tables in a relational database include a row of column names specifying one or more attribute fields, and zero or more data rows containing one scalar value for each of the column fields. Objects referred to in the query received from the client are identified by a primary key. And the row in the underlying relational database table corresponding to this object is also uniquely identified by this primary key. A foreign key is an attribute(s) in a table that forms a relationship with another table by storing the primary key value of the related table.
Given a set of database tables, foreign key relationships are not necessarily present. If foreign key relationships are present in the set of tables, then the foreign key relationships are classified as either acyclic or cyclic.
FIG. 1 is a diagram illustrating acyclic foreign key relationships between example database tables. In the example, three tables are shown: Department, Project, and Employee. Each table includes one or more rows, although only one row is shown, and each row includes a primary key (designated with a * symbol) that uniquely identifies that row. In the Department table, the attribute xe2x80x9cdepnoxe2x80x9d is the primary key. In the Project table, the attribute xe2x80x9cprojnoxe2x80x9d is the primary key. And in the Employee table, the attribute xe2x80x9cempnoxe2x80x9d is the primary key.
As shown by the arrows, a foreign key relationship exists between the Employee table and the Department table and between the Employee table and the Project table, because the Employee table includes the primary key xe2x80x9cdepnoxe2x80x9d from the Department table and the primary key xe2x80x9cprojnoxe2x80x9d from the Project table. The Department and Project tables are referred to as parent tables since they are the source of the primary keys in the Employee table. The Employee table is referred to as the child because it contains the foreign keys. The Employee table is dependent because its foreign key values are constrained to be values of the primary keys in the parent tables.
FIG. 2 is a diagram illustrating cyclic foreign key relationships between example tables. In the example, three tables are shown: Department, Employee, and Network. As before, the primary key of the Department table is xe2x80x9cdepnoxe2x80x9d and the primary key of the Employee table is xe2x80x9cempnoxe2x80x9d. The Network table""s primary, key is xe2x80x9cnetnoxe2x80x9d.
A cycle of foreign key relationships exists between the tables because there is a closed loop of child/parent relationships between the tables. As shown, the Department table is a child of the Network table since it includes the Network table""s primary key xe2x80x9cnetnoxe2x80x9d, the Network table is a child of the Employee table since it includes the Employee table""s primary key xe2x80x9cempnoxe2x80x9d, and the Employee table is a child of the Department table since it includes the Department table""s primary key xe2x80x9cdepnoxe2x80x9d. In contrast, the acyclic example in FIG. 1 does not include a closed loop of child/parent relationships between the tables.
During operation of the database, an application may perform a number of object creation, removal and modification operations. The operations are typically implemented through commands such as Insert row, Delete row, and Update row, respectively. The application may pass one or more such row operations to the database and require that the changes be applied to the database at transaction commit time.
At commit time, the operations may alter one or more parent-child relationships between two objects by creating a new child for a parent, moving a child from one parent to another, removing a child""s parent, or removing a child. When the parent-child relationship changes, the foreign key values need to be updated accordingly. However, the relational database imposes referential integrity (RI) constraints on foreign key updates. For example, a parent row""s delete must be preceded by all children rows"" deletes, and any child row can be inserted only after the parent has been inserted. Stated more formally, the RI constraints are:
a) a parent with a primary key must be inserted into the database or already exist before that foreign key may be assigned to a child row, and
b) a child row having a foreign key must be deleted before the parent with the primary key can be deleted.
When the list of row operations to be performed is received by the database, the insert, update and delete of the rows must be ordered to satisfy these RI constraints.
Conventional methods for ordering the row operations take all the rows to be modified and order them in a list to satisfy the constraints. This entails sequentially traversing the row operations and for each row finding rows that have corresponding children and/or parents. The child/parent rows are then examined to determine whether the child/parent needs to be modified before the current row. If it does, the child/parent row is placed in the list before the current row. If the child/parent row needs to be modified after the current row, then it is placed in the list after the current row. The result of this process is a linked list whose order defines the order of the updates in a way that satisfies all RI constraints. Although this method effectively accomplishes the task of ordering the row operations, the method is computationally expensive if there are many rows to be modified.
Another issue with database row operations is the case where foreign and primary keys overlap. A foreign or primary key can contain multiple attributes, and two keys overlap when they have one or more attributes in common. A foreign key may partially or completely overlap with other foreign or primary keys.
Consider, for example, a child table T1 with a foreign key (A1, A2, A3), where A1, A2 and A3 are the attributes in the foreign key. When a user updates the foreign key for a given row in table T1, the value of some or all the attributes in the foreign key may change. Key overlap poses certain challenges when foreign keys are modified.
In case where a foreign key overlaps a primary key, the update of the foreign key may or may not be allowed. When an update is allowed, it invalidates any other foreign key the current key partially but not completely overlaps with. The task is to define which foreign key updates should be allowed and which ones should be forbidden.
To handle a row updates efficiently, the system must determine for each attribute in a row to be updated, the keys (foreign or primary) the attribute overlaps with, the keys the attribute completely contains, and the keys the attribute is completely contained by. This information determines whether to allow the foreign key update or not (it will not be allowed if it changes a primary key attribute value). It also determines what other foreign keys are invalidated by this update.
In conventional systems, this information is maintained in linked lists. Unfortunately, however, creating and scanning large linked lists is memory intensive and can be slow.
Accordingly, what is needed is a method and system for more efficiently ordering the database row operations to satisfy the RI constraints, and for detecting and handling other foreign and primary key updates when an overlapping foreign key is updated. The present invention addresses such needs.
The present invention is a method and system for handling foreign key database updates. The database includes one or more tables where each table includes at least one row and a primary key or foreign key. The method and system include evaluating a list of row operations for foreign key relationships. After evaluating the foreign key relationships, the tables determined to have acyclic foreign key relationships are grouped into a first set of tables and the tables determined to have cyclic foreign key relationships are grouped into a second set of tables. The method and system further include ordering the first set of tables into a list based on the foreign key relationships among the tables, such that a parent table will be processed before a child table. The row operations are then applied to each table in the list in the specified order. In a second aspect of the present invention, overlapping foreign and primary keys are represented as a bit map to increase processing speed.
According to the method and system of the present invention, the order that the row operations are applied is based on table ordering, rather than row ordering, which speeds processing since in almost every case, there are often less tables to order than rows. In addition, tables having acylic foreign key relationships are processed separately from tables having cyclic foreign key relationships. The method and system use an optimized algorithm for handling foreign key updates for acylic tables, taking advantage of the fact that acyclic relationships occur much more frequently than cyclic one, thereby increasing overall speed of the system.