The invention pertains to the problem of potential database deadlocks or timeouts due to the locking of resources during transactions, particularly on a content management (CM) system. Databases store data in a variety of manners depending on the internal organization. For example, a relational database system typically stores data in tables. The tables consist of rows, each of which contains a record. The record, in turn, contains entities and the entities contain the actual related data values for a data “object.” Each table may also be associated with one or more indexes, which provide rapid access to the rows in an order determined by the index and based on key data values contained in selected entities in each row. As an example, a row might be associated with each employee of an organization and contain entities that hold such information as the employee name, an identification number, and telephone numbers. One index might order the rows numerically by employee identification number, while another index might order the rows alphabetically by employee name.
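By way of illustration only, the employee example above can be sketched as follows. The row values and the names rows, index_by_id, index_by_name and scan are hypothetical and are not taken from any particular database system; each index is modeled simply as a list of row positions ordered by a key.

```python
# Hypothetical employee rows: (identification number, name, telephone number).
rows = [
    (1042, "Nguyen", "555-0101"),
    (1007, "Alvarez", "555-0102"),
    (1100, "Baker", "555-0103"),
]

# Each index holds row positions ordered by one key entity of the row.
index_by_id = sorted(range(len(rows)), key=lambda i: rows[i][0])
index_by_name = sorted(range(len(rows)), key=lambda i: rows[i][1])


def scan(index):
    """Return the rows in the order determined by the given index."""
    return [rows[i] for i in index]
```

Scanning index_by_id yields the rows in numerical order of identification number, while scanning index_by_name yields the same rows in alphabetical order of name, without the table itself being reordered.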
Such a database conventionally includes methods which insert and delete rows and update the information in a row. When changes are made to the rows, any database indexes associated with the table may also need to be updated in order to keep the indexes synchronized with the tables. The rows in each table are mapped to a plurality of physical pages on the disk to simplify data manipulation. Such an arrangement is illustrated in FIG. 1.
In FIG. 1, table 10, which illustratively consists of rows 12, 14, 16, and 18, is mapped to a chain of pages of which pages 20, 22, and 24 are shown. In the table illustrated, each row consists of five separate entities. For example, row 12 consists of entities 26, 28, 30, 32 and 34. The entities in each of rows 12, 14, 16 and 18 are mapped illustratively to page 22 which can contain data for more than one row. For example, entity 26 maps to location 36 in page 22. Entity 28 maps to location 38. Entity 30 maps to location 40. In a similar manner entity 32 maps to location 42 and entity 34 maps to location 44. The entities in the next row 14 are mapped directly after the entities in row 12. For example, entity 46 is illustrated and maps to page location 48. When the page is completely filled with data, entity information is mapped to the next page in the page chain. The pages are chained together by means of page pointers. For example, page pointer 50 links pages 20 and 22, whereas page pointer 52 links pages 22 and 24. All of the pages used to store the data in table 10 are linked together in a similar manner in a page chain.
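The mapping of FIG. 1 can be approximated by the following minimal sketch, in which entities are appended to a page until it is full and a page pointer then links to the next page in the chain. The Page class, the capacity parameter and the helper names are assumptions made for illustration; actual page layouts are considerably more involved.

```python
class Page:
    """A fixed-capacity page holding entity values (hypothetical layout)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []    # page locations, filled in order
        self.next = None   # page pointer to the next page in the chain


def store_table(rows, capacity):
    """Map the entities of each row, in order, onto a chain of pages."""
    head = page = Page(capacity)
    for row in rows:
        for entity in row:
            if len(page.slots) == page.capacity:
                # Page is completely filled: link a new page and continue.
                page.next = Page(capacity)
                page = page.next
            page.slots.append(entity)
    return head


def chain_fill(head):
    """Return the number of occupied locations on each page of the chain."""
    counts = []
    while head is not None:
        counts.append(len(head.slots))
        head = head.next
    return counts
```

With four rows of five entities each and an assumed capacity of eight locations per page, the twenty entities occupy a chain of three pages, the entities of one row being stored directly after those of the preceding row.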
The data pages are normally kept in a page buffer pool located in system memory. In order to make such a database system persistent or “durable”, the data pages must be written to an underlying non-volatile storage system, such as a disk storage. This storage operation takes place on a page level so that when a modification is made to data on a page the entire page is stored in the persistent storage. Each page could be copied to the persistent storage as soon as data on the page is modified. However, this immediate copying greatly slows the system operation since persistent storage is generally much slower than RAM. Alternatively, the information in modified pages in the buffer pool can be copied or “flushed” to the disk storage at intervals. For example, the information could be flushed periodically or when the number of changed pages in the buffer pool reaches some predetermined threshold. During this disk flushing operation, the data modifications are performed “in place” so that the old data is either overwritten or deleted from the disk and lost.
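The threshold-based flushing described above may be sketched as follows. This is a simplified illustration under stated assumptions: a dictionary stands in for the disk storage, and the BufferPool class and its dirty_threshold parameter are hypothetical names rather than the interface of any actual system.

```python
class BufferPool:
    """In-memory page buffer that flushes modified pages at a threshold."""

    def __init__(self, disk, dirty_threshold):
        self.disk = disk                  # dict standing in for disk storage
        self.pages = {}                   # page id -> current page contents
        self.dirty = set()                # ids of pages changed since last flush
        self.dirty_threshold = dirty_threshold

    def write(self, page_id, data):
        """Modify a page in memory; flush when enough pages have changed."""
        self.pages[page_id] = data
        self.dirty.add(page_id)
        if len(self.dirty) >= self.dirty_threshold:
            self.flush()

    def flush(self):
        """Copy whole modified pages to disk in place; old contents are lost."""
        for page_id in sorted(self.dirty):
            self.disk[page_id] = self.pages[page_id]
        self.dirty.clear()
```

In this sketch a modification to a single page is not written to the disk until the number of changed pages reaches the threshold, which is the window in which a system failure could lose committed data.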
Since the data is lost during the modification process, in order to ensure data integrity in the case of a system failure, or crash, the actions performed on the database are grouped into a series of “transactions”. Each transaction is “atomic” which means that either all actions in the transaction are performed or none are performed. The atomic property of a transaction ensures that the transaction can be aborted or “rolled back” so that all of the actions which constitute the transaction can be undone. Database transactions commonly have a “commit” point at which time it can be guaranteed that all actions which comprise the transaction will complete properly. If the transaction does not reach the commit point, then it will be rolled back so that the system can return to its state prior to the initiation of the transaction. Consequently, if there is a system termination or crash prior to the commit point, the entire transaction can be rolled back.
The use of a buffer pool complicates transaction processing because even though a transaction has committed, system operation could terminate after a page has been modified, but before the modified page is flushed to disk. In order to prevent data loss caused by such a system interruption, a logging system is used to permit data recovery. The logging system records redo and undo information for each data modification in a special file called a “recovery log” that is kept in non-volatile storage.
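One simplified way in which redo and undo information could be used after a failure is sketched below. The record layout, the recover function and the two-pass order are assumptions made for illustration and do not describe the logging system of any particular database; the sketch also assumes that write locks are held until commit, so that no transaction overwrites an uncommitted value.

```python
def recover(disk, log):
    """Restore disk to a state reflecting only committed transactions.

    Hypothetical log records:
      ("update", txn, key, old_value, new_value)
      ("commit", txn)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo pass: reapply every update of a committed transaction, in log order.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, _, new_value = rec
            disk[key] = new_value

    # Undo pass: reverse updates of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, old_value, _ = rec
            disk[key] = old_value

    return disk
```

Under these assumptions the result is the same whether or not a given modified page had been flushed before the failure: committed changes are reapplied from the redo information and uncommitted changes are reversed from the undo information.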
During the processing of a CM transaction, it is to be appreciated that locks are placed on database pages and resources so that a second, concurrent CM transaction cannot replace entities without the knowledge of the first CM transaction before the first CM transaction has modified selected entities and performed a write operation for those modifications. Additionally, many systems add the restriction that all write locks created by a CM transaction should be held until the transaction commits.
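The restriction that write locks are held until commit can be illustrated by the following minimal sketch. The LockTable class and its method names are hypothetical; a practical lock manager would also support shared locks, queues of waiting transactions, and time-outs.

```python
class LockTable:
    """Exclusive write locks that are released only when the owner commits."""

    def __init__(self):
        self.owner = {}  # resource -> transaction holding its write lock

    def acquire(self, txn, resource):
        """Return True if txn now holds the lock, False if it must wait."""
        holder = self.owner.setdefault(resource, txn)
        return holder == txn

    def commit(self, txn):
        """Release every lock held by txn at its commit point."""
        for resource in [r for r, t in self.owner.items() if t == txn]:
            del self.owner[resource]
```

A second transaction requesting a resource locked by a first transaction is refused until the first transaction commits, which is the waiting condition from which deadlocks and time-outs arise.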
A problem that arises with CM transaction schedulers is that transactions can get involved in deadlocks or can time-out waiting for a resource to be released from a lock. CM transactions sometimes have to wait for locks where such waiting is caused by another transaction holding a conflicting lock, and the waiting transaction cannot make any progress until the other transaction releases its lock. If two CM transactions are waiting for each other, neither can make progress until the other one releases its lock. As long as neither of them releases its lock, the two transactions are deadlocked. More generally, deadlocks can involve more than two CM transactions that are waiting for each other in a cyclic way.
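The cyclic waiting described above is commonly modeled as a cycle in a "waits-for" graph, in which each transaction points to the transactions whose locks it is waiting on. The following depth-first search is a generic sketch of that idea, offered only as an illustration; the find_deadlock function and the graph representation are assumptions and do not represent the method of the present invention.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a waiting cycle, or None.

    waits_for maps each transaction to the transactions it is waiting on.
    """
    visiting, done = set(), set()

    def visit(txn, path):
        if txn in done:
            return None
        if txn in visiting:
            # txn is already on the current path: the cycle starts there.
            return path[path.index(txn):]
        visiting.add(txn)
        path.append(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other, path)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(txn)
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None
```

Two transactions waiting on each other form a cycle of length two, and the same search reports longer cycles involving three or more transactions; a chain of waiting that ends at a transaction which is not itself waiting is not reported as a deadlock.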
Therefore, it is desirable to provide a method and apparatus which can reduce the potential for deadlocks and time-outs caused by resource locking, particularly in a high volume CM system.
The present invention therefore provides a solution to the aforementioned problems, and offers other advantages over the prior art.