A database is a collection of stored data that is logically related and that is accessible by one or more users. A popular type of database is the relational database management system (RDBMS), which includes relational tables made up of rows and columns. Each row represents an occurrence of an entity defined by a table, with an entity being a person, place, or thing about which the table contains information. To extract data from, or to update, a relational table, queries according to a standard database query language (e.g., Structured Query Language or SQL) are submitted to the database system. A table (also referred to as a relation) is made up of multiple rows (also referred to as tuples). Each row (or tuple) includes multiple columns (or attributes).
A popular application of database management system is data warehousing. A data warehouse is basically a collection of data received from various sources. One example type of data warehouse is in the retail context, where information regarding shopping habits, histories, and other information of customers are collected and stored in the data warehouse. Traditionally, new data is loaded into a data warehouse in batches at regular time intervals (e.g., every night). As a result, enterprises often make decisions based on stale data.
To address this issue, operational data warehouses are used. In an operational data warehouse, new data is loaded into the data warehouse in real time (or close to real time). Typically, a load utility is used to continually load data into the data warehouse. Data loading is accomplished by use of update transactions submitted to the database management system in which the data warehouse is implemented.
An issue associated with database system operation is the occurrence of deadlock among different transactions. The occurrence of deadlock is even more likely when there are multiple transactions modifying base relations of a materialized view. A materialized view stores a pre-computed query result that can be used for satisfying certain subsequent queries. Computing the result from a materialized view is generally more computationally efficient than computing the result from base relation(s).
One type of a materialized view is a join view, which stores results of a join of multiple base relations. In a procedure referred to as materialized view maintenance, a materialized view is updated when underlying base relations are modified. As base relations are changed through insertion of rows, deletion of rows, or updating of rows, the corresponding rows of the materialized view are also modified.
To maintain data consistency, each transaction in a database system typically places some type of lock on relations and materialized views that are involved in the transaction. In some scenarios, the conflicting locks for multiple transactions are placed on the relations and materialized views in such an order that none of the multiple transactions can proceed further—a deadlock condition. A deadlock among transactions reduces the ability of transactions to successfully complete in a database system.
In an operational data warehouse, a load utility is continually loading data into a database system, which causes generation of a relatively large number of transactions to perform the data loading. The occurrence of deadlock conditions during the loading of data in an operational data warehouse will cause data warehouse performance to suffer.