Large organizations today face many challenges when engineering large custom software applications. For example, these custom software applications may be required to meet the needs of various departments within the organizations. These custom software applications may be required to interface with other software applications used by users located at different geographic locations. These organizations commonly use database-centric applications (DCAs) which in turn use nontrivial databases. DCAs are applications that interact directly with databases, and data related to the DCAs are stored in the databases. The DCAs typically contain multiple concurrent threads and processes that access and manipulate one or more shared databases concurrently. DCAs typically use shared databases without notifying other applications that also use the shared databases. In addition, different applications are often developed by different departments within the organizations or by different organizations without coordinating with one another. Sharing databases thus leads to a high incidence of concurrency errors known as database deadlocks, which is one of the main reasons for major performance degradation in the DCAs.
Concurrency errors, or database deadlocks occur when two or more threads of execution lock one resource while waiting for other resources. When this occurs in a circular fashion, the database is “deadlocked.” Database deadlocks are special kinds of deadlocks that result from interactions between DCAs and shared databases. To date, database deadlocks are typically detected using various algorithms and resolved using time-out mechanisms provided by a Database Management System (DBMS). Once a deadlock occurs, the DBMS rolls back or reverses the transactions that caused it. For example, the DBMS may unlock resources or cancel waiting requests from other threads of execution when one resource is being used.
One way of avoiding the deadlock involves the use of Structured Query Language (SQL), which is a computer language designed for managing data in a DBMS. The process of avoiding deadlock may include compiling SQL statements, creating and optimizing an execution plan designed to resolve the deadlock, and simulating its execution to determine how basic relational operators in the DBMS would access and manipulate data elements. In practice, this strategy means reproducing the work that the DBMS performs as part of executing transactions, which results in significant overhead that may double the execution time of each SQL statement. In addition, the timeout mechanism and the ensuing transaction rollback causes significant performance degradation in the DCAs involved. Performance degradation in DCAs may in turn result in significant financial costs to organizations running DCAs and their clients. Further, the possibility of concurrency errors has forced software developers to adopt a “defensive” programming style, which stifles productivity.
In another aspect, database deadlocks are fundamentally different because they depend on the structure and the content of a database that is shared by threads or processes of the multiple DCAs that use the database. DBMSes provide layers of abstractions to guarantee Atomicity, Consistency, Isolation, and Durability (ACID) properties. The atomicity property ensures that each transaction in a database is atomic such that if any part of a transaction fails, the entire transaction fails, thus leaving the database unaffected. The consistency property ensures that the database remains in a consistent state, such that if a transaction is executed that violates the database's consistency rules, the transaction could be rolled back or reversed to the pre-transaction state. The isolation property ensures that other transactions do not access data that has been modified during a transaction that has not yet completed. The durability property ensures that the DBMS can recover any transaction updates in the event of a system failure. These properties help to make database transactions process reliably. But, these properties do not prevent deadlocks, which are viewed as the responsibility of database users and/or programmers. However, it is difficult to determine in advance what SQL statements these users will execute. It is similarly difficult to determine in advance all users of the systems and what DCAs the users may run that will share the same database.
Moreover, since database deadlocks are not explicit, i.e., there is no clear indication as to which DCAs may potentially create the deadlock, they cannot be easily identified by analyzing source code of the DCAs that may be involved in the deadlocks. This problem is aggravated by the fact that nontrivial DCAs are highly concurrent and use multiple levels of abstraction. Database deadlocks appear often after applications grow in complexity in different ways: more DCAs are added; more data is added to the database; transactions become longer; new transactions are added; the order of lock requests is modified; the type of the database is changed, etc. It is a time consuming task to detect database deadlocks in DCAs by analyzing their source code. Currently, it is the job of the database engine to detect deadlocks and resolve them by discarding some transactions. By the time these deadlocks are resolved, damage to system performance is done. In addition, certain discarded transactions may result in errors and exceptions inside the DCAs that issued the cancelled transactions resulting in the loss of valuable and intensive computations that have already been performed by the transactions prior to their cancellation.
Identifying database deadlocks thus requires sophisticated reasoning regarding the behavior of DCAs and the databases that they use. This process is overwhelmingly complex and may impose a significant cost to achieve desired performance of software. It is thus desirable to provide systems and methods to handle the database deadlocks.