The present invention relates to database systems. More specifically, the present invention relates to conflict detection in an object-relational database system.
In conventional relational databases, all data are stored in named tables. The tables are described by their features. In other words, the rows of each table contain objects of identical type, and the definitions of the columns of the table (i.e., the column names and the data types stored in the column) describe the attributes of each of the instances of the object. By identifying its name, its column names and the data types of the column contents, a table is completely described.
Relational databases offer several advantages. Database queries are based on a comparison of the table contents. Thus, no pointers are required in relational databases, and all relations are treated uniformly. Further, the tables are independent (they are not related by pointers), so it is easier to maintain dynamic data sets. The tables are easily expandable by simply adding new columns. Also, it is relatively easy to create user-specific views from relational databases.
There are, however, a number of disadvantages associated with relational databases as well. For example, access to data by reference to properties or attributes is not optimal in the classical relational data model. This can make such databases cumbersome in many applications.
Another recent technology for database systems is referred to as object oriented database systems. These systems offer more complex data types in order to overcome the restrictions of conventional relational databases. In the context of object oriented database models, an “object” includes both data and the functions (or methods) which can be applied to the object. Each object is a concrete instance of an object class defining the attributes and functions of all its instances. Each instance has its unique identifier by which it can be referred to in the database.
Object oriented databases operate under a number of principles. One such principle is referred to as inheritance. Inheritance means that new object classes can be derived from an existing class. The new classes inherit the attributes and methods of the existing class (the super-class) and offer additional attributes and operations. An instance of the derived class is also an instance of the super-class. Therefore, the relation between a derived class and its super-class is referred to as the “isA” relation.
A second principle related to object oriented databases is referred to as “aggregation.” Aggregation means that composite objects may be constructed as consisting of a set of elementary objects. A “container object” can communicate with the objects contained therein using methods of the contained objects. The relation between the container object and its components is called a “partOf” relation because a component is a part of the container object.
Yet another principle related to object oriented databases is referred to as encapsulation. According to encapsulation, an application can only communicate with an object through messages. The operations provided by an object define the set of messages which can be understood by the object. No other operations can be applied to the object.
Another principle related to object oriented databases is referred to as polymorphism. Polymorphism means that derived classes may re-define methods of their super-classes.
Objects present a variety of advantages. For example, operations are an important part of objects. Because the implementations of the operations are hidden from an application, objects can be more easily used by application programs. Further, an object class can be provided as an abstract description for a wide variety of actual objects, and new classes can be derived from the base class. Thus, if an application knows the abstract description and uses only the methods provided by the base class, the application can still accommodate objects of the derived classes, because the objects in the derived classes inherit these methods. However, object oriented databases are not yet as widely used in commercial products as relational databases.
Yet another database technology attempts to combine the advantages of the wide acceptance of relational databases and the benefits of the object oriented paradigm. This technology is referred to as object-relational (O-R) database systems. There are generally two variants of O-R systems: one adds object capabilities to the database management system itself; the other is external to the database and addresses the “impedance mismatch” between objects and relational tables. Unless and until objects run in the database, systems of the latter kind will be necessary. Some of these databases employ a data model that attempts to add object oriented characteristics to tables. All persistent (database) information is still in tables, but some of the tabular entries can have a richer data structure. These data structures are referred to as abstract data types (ADTs). An ADT is a data type that is constructed by combining basic alphanumeric data types. The support for abstract data types presents certain advantages. For example, the operations and functions associated with the new data type can be used to index, store, and retrieve records based on the content of the new data type.
In a multi-user environment, there are generally two types of approaches for detecting conflicts when updating data in a database. These are known as pessimistic concurrency and optimistic concurrency.
Pessimistic concurrency involves locking rows at the datasource to prevent users from modifying data in a way that affects other users. In a pessimistic model, when a user performs an action that causes a lock to be applied, other users cannot perform actions that would conflict with the lock until the lock owner releases it. This model is primarily used in environments where there is heavy contention for data, and/or where the cost of protecting data with locks is less than the cost of rolling back transactions if concurrency conflicts occur.
In a pessimistic concurrency model, a user who reads a row with the intention of changing it establishes a lock. Until the user has finished the update and released the lock, no one else can change that row. For this reason, pessimistic concurrency is usually implemented when lock times will be relatively short, as in programmatic processing of records.
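The lock-then-update behavior described above can be sketched as follows. This is only a minimal in-memory illustration, assuming a hypothetical `RowLockTable` class and made-up user names; a real database manages row locks inside the server.

```python
import threading

class RowLockTable:
    """Hypothetical in-memory lock table keyed by row id (illustration only)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the owner map itself
        self._owners = {}               # row_id -> user currently holding the lock

    def acquire(self, row_id, user):
        """Return True if the user now holds the row lock, False if another user does."""
        with self._guard:
            owner = self._owners.get(row_id)
            if owner is not None and owner != user:
                return False
            self._owners[row_id] = user
            return True

    def release(self, row_id, user):
        """Release the row lock, but only if this user is the owner."""
        with self._guard:
            if self._owners.get(row_id) == user:
                del self._owners[row_id]

locks = RowLockTable()
a_got_lock = locks.acquire(42, "user_a")         # user A reads row 42 intending to change it
b_blocked = not locks.acquire(42, "user_b")      # user B cannot change the row meanwhile
locks.release(42, "user_a")                      # user A finishes the update
b_got_lock_later = locks.acquire(42, "user_b")   # now user B may proceed
```

The sketch refuses the second user outright rather than making that user wait; either policy is consistent with the model described above.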
In contrast, optimistic concurrency does not lock a row when a user is reading it. When a first user wants to update data, the application generally determines whether another user has changed the affected data after the first user read the data, but before the first user attempted to update the row. Optimistic concurrency is generally used in environments with low contention for data. This improves performance because no locking of records is required, and locking of records consumes additional server resources. Also, in order to maintain record locks, a persistent connection to the database server is generally required. Because this is not the case in an optimistic concurrency model, connections to the server are free to serve a larger number of clients in less time. Another reason for optimistic concurrency in the world of the Internet is that connections cannot be held across a firewall.
Optimistic concurrency is typically effected using one of two common algorithms: row level conflict detection and column (or field) level conflict detection. In row level detection, if a row in a table has changed after the row has been read but before data is written to the row, a conflict is registered. This is normally achieved by checking a time stamp column, or by comparing the existing values on the row with the original values in the row when the row was first read, to ensure that none of the values have changed. Column level detection determines if any of the columns, or fields, that were read and modified by a user have been changed (by another user) since they were read by the first user. If this occurs, a conflict is registered. This is achieved by comparing the original values of the changed columns with the values currently stored in those columns. These known methods of managing conflicts within a multi-user database system do not adequately provide for efficient object-relational database use. The limitations of these approaches are illustrated in the following examples.
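As an illustration of row level detection by checking a time stamp column, the following sketch uses Python's built-in sqlite3 module. The `account` table, its columns, and the integer `version` counter (standing in for a time stamp) are assumptions made for this example, not part of any particular system.

```python
import sqlite3

# Hypothetical schema: "version" plays the role of the time stamp column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL, version INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 100.0, 1)")

def read_row(conn, row_id):
    """Read the row without locking it; the caller keeps the version it saw."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (row_id,)
    ).fetchone()

def update_balance(conn, row_id, new_balance, version_read):
    """Write only if the row still carries the version that was read.

    Returns True on success and False when a row level conflict is detected.
    """
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, row_id, version_read),
    )
    return cur.rowcount == 1

# Two users read the same row, then both attempt to write it.
_, version_a = read_row(conn, 1)
_, version_b = read_row(conn, 1)
first_ok = update_balance(conn, 1, 150.0, version_a)   # row unchanged since read
second_ok = update_balance(conn, 1, 80.0, version_b)   # version is stale: conflict
final_balance, final_version = read_row(conn, 1)
```

Placing the version test in the WHERE clause makes the check and the write a single statement, so no separate lock is needed between them.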
Column level detection is not always correct in that, in some cases, a real world conflict does in fact occur and is not detected by column level detection. If two columns are conceptually linked, users can still make independent simultaneous changes to each column without generating a conflict. For example, assume a bank has an account table with two columns, balance_in_savings and balance_in_checking, and a policy that no account surcharge is assessed for transactions if the aggregate of the savings and checking balances is above $10,000.00. Column level conflict detection would fail to detect a conflict in the following case. Assume that Joe Smith has a checking balance of $999.00 and a savings balance of $9,000.00. Users A and B both read the Joe Smith entry containing these two balances. User A adds $2.00 to the savings balance and User B subtracts $2.00 from the checking balance. User B updates first, making the combined balance $9,997.00. However, User A, working from the values originally read, computes a combined balance of $10,001.00, believes the ending balance is above $10,000.00, and accordingly does not assess a surcharge and updates. Because Users A and B changed different columns, no conflict is registered. This is incorrect: the actual combined balance after both updates is $9,999.00.
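The scenario above can be replayed in a short sketch. The helper names (`column_conflict`, `try_update`) and the dictionary standing in for the table row are assumptions made for illustration; only the column names, amounts, and threshold come from the example.

```python
def column_conflict(original, current, changed_columns):
    """Column level check: a conflict exists only if a column this user changed
    now differs in the database from the value the user originally read."""
    return any(original[c] != current[c] for c in changed_columns)

def try_update(db_row, original, changes):
    """Apply changes to db_row unless column level detection reports a conflict."""
    if column_conflict(original, db_row, changes.keys()):
        return False
    db_row.update(changes)
    return True

THRESHOLD = 10_000.00  # the surcharge is waived above this combined balance

db_row = {"balance_in_savings": 9000.00, "balance_in_checking": 999.00}
snapshot_a = dict(db_row)  # what User A read
snapshot_b = dict(db_row)  # what User B read

# User B subtracts $2.00 from checking and updates first.
b_ok = try_update(
    db_row, snapshot_b,
    {"balance_in_checking": snapshot_b["balance_in_checking"] - 2.00},
)

# User A adds $2.00 to savings and judges the surcharge from its stale snapshot.
new_savings = snapshot_a["balance_in_savings"] + 2.00
total_seen_by_a = new_savings + snapshot_a["balance_in_checking"]
a_waives_surcharge = total_seen_by_a > THRESHOLD
a_ok = try_update(db_row, snapshot_a, {"balance_in_savings": new_savings})

actual_total = db_row["balance_in_savings"] + db_row["balance_in_checking"]
```

Both updates are accepted because each user touched a different column, yet `a_waives_surcharge` is true while `actual_total` is below the threshold.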
Row-level detection is always correct, but can reduce concurrency. This is especially problematic in object-relational databases where more than one object can be independently stored on the same row of a table. For example, where one row contains both user reference data (e.g. the customer address) and accumulators (e.g. customer activity over time in number of invoices, dollars, etc), updating reference data and updating accumulators at the same time is reasonable and should not generate a conflict. However, row-level conflict detection will register a conflict and prohibit simultaneous updates of such data.
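The loss of concurrency can be seen in a small sketch of row-level detection by value comparison. The row contents, column names (`address`, `invoice_count`, `invoice_dollars`), and helper names are hypothetical and chosen only to mirror the reference data and accumulators mentioned above.

```python
def row_conflict(original, current):
    """Row-level check: any difference anywhere on the row counts as a conflict."""
    return original != current

def try_update(db_row, original, changes):
    """Apply changes to db_row unless row-level detection reports a conflict."""
    if row_conflict(original, db_row):
        return False
    db_row.update(changes)
    return True

# One row holding both reference data and accumulators.
db_row = {"address": "1 Main St", "invoice_count": 12, "invoice_dollars": 3400.00}
snapshot_clerk = dict(db_row)    # read by a user editing the address
snapshot_billing = dict(db_row)  # read by a process updating the accumulators

# The accumulator update is written first and succeeds.
billing_ok = try_update(
    db_row, snapshot_billing,
    {"invoice_count": 13, "invoice_dollars": 3650.00},
)

# The address change touches unrelated data, yet it is rejected
# because the row as a whole no longer matches what was read.
clerk_ok = try_update(db_row, snapshot_clerk, {"address": "22 Oak Ave"})
```

Here the second update is refused even though the two writers never touched the same fields.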
Another limitation of current conflict detection is that it does not recognize changes to different tables as part of the same logical unit of work. For example, a change to an order in one table will not conflict with a change to an order line that is stored in a different table, order_line, but is part of the same order. Further, when an object includes fields for a class, which fields are mapped to columns of multiple tables, current conflict detection does not register conflicts even though users can be simultaneously modifying data for the same object.
Thus, using current conflict detection strategies with today's object-relational database systems will either not guarantee that conflicts are generated when they should be, or unduly reduce concurrency. Providing a solution to these problems would facilitate the widespread use of object-relational database systems in multi-user environments.