The present invention relates generally to the field of databases, and more particularly to transaction control within a database management system (DBMS).
Databases are critical pieces of the IT infrastructure required by businesses, organizations, and government functions of virtually any size. Databases act as a central repository of information and eliminate the laborious task of searching for information within hardcopy files that have fixed locations or within electronic files that each contain only a portion of the information. Databases allow the user not only to access information but also to work with the information stored within the database. Access to databases can occur locally or remotely, by using the Internet or wireless technologies (e.g., smartphones), so databases can be accessed from virtually anywhere in the world. Databases have grown from repositories of information for a single individual or business to data warehouses that handle a plurality of information types (e.g., photographs, personal information, news articles, medical information, etc.). For example, e-commerce, on-line banking, and on-line brokerage account applications make extensive use of databases. Guaranteeing the security and integrity (e.g., reliability) of the transactions, and of the effect those transactions have on the data within a database, is important.
Computer science outlines a set of properties that guarantee that database transactions are processed reliably. These properties are atomicity, consistency, isolation, and durability (ACID). Atomicity requires that each transaction be “all or nothing”; if one part of a transaction fails, the entire transaction fails, and the state of the database is left unchanged. Consistency ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. Examples of consistency guarantees include: that any transaction started in the future necessarily sees the effects of other transactions committed in the past; that database constraints are not violated, particularly once a transaction commits; and that operations in transactions are performed accurately, correctly, and with validity with respect to application semantics. Isolation ensures that the concurrent execution of transactions results in the same system state that would be obtained if the transactions were executed serially (i.e., one after the other). Providing isolation is the main goal of concurrency control. Using concurrency control methods, the effects of an incomplete transaction are not visible to another transaction. Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. For example, transactions and/or their effects are recorded in a non-volatile memory.
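As an illustration of the atomicity and consistency properties described above, the following is a minimal sketch using Python's standard sqlite3 module. The accounts table, the account names, and the transfer and balances helpers are hypothetical and are introduced only for this example; the CHECK constraint stands in for a "defined rule" that a transaction must not violate.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one transaction.

    Returns True if the transaction committed, False if it was rolled back.
    """
    try:
        # The connection context manager commits on success and rolls
        # back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This statement violates the CHECK constraint when `src`
            # has insufficient funds, which fails the whole transaction.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

# The credit to bob succeeds, the debit from alice fails, and both are undone.
failed = transfer(conn, "alice", "bob", 500)
after_failed = balances(conn)

# A transfer that satisfies the constraint commits both updates together.
ok = transfer(conn, "alice", "bob", 30)
after_ok = balances(conn)
```

In this sketch the first transfer leaves both balances unchanged even though its first UPDATE had already been applied, which is the "all or nothing" behavior; the second transfer moves the database from one valid state to another.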
Database management systems (DBMS) employ various methods (e.g., locking, multi-version concurrency control (MVCC), etc.) to provide ACID capabilities for transactions and ensure ACID compliance. Multi-version concurrency control (sometimes also abbreviated MCC) provides each read transaction with the prior, unmodified version of data that is being modified by another active transaction. When an MVCC database needs to update an item of data, it does not overwrite the old data with new data but instead marks the old data as obsolete and adds the newer version elsewhere. Thus, multiple versions are stored, but only one version is the latest. Snapshot isolation is commonly implemented using MVCC. Snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database, and that the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot. Read transactions under MVCC typically use a timestamp or transaction ID to determine which state of the database to read, and then read the versions of the data that belong to that state. This avoids managing locks for read transactions because writes can be isolated by virtue of the old versions being maintained, rather than through a process of locks or mutexes. This allows read transactions to access the data that was present when they began reading, even if the data was modified or deleted part way through by some other write transaction. Writes affect a future version, but at the transaction ID (e.g., timestamp) that the read is working at, everything is guaranteed to be consistent because the writes are occurring at a later transaction ID. MVCC generally requires the system to periodically sweep through and delete the old, obsolete data objects.
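The versioning, snapshot read, commit-time conflict check, and periodic sweep described above can be sketched as a small in-memory store. This is a simplified illustration and not the mechanism of any particular DBMS: the MVCCStore and Transaction classes, the integer commit clock, and the "first committer wins" conflict rule are assumptions made for the example, and deletions and concurrent threads are not modeled.

```python
class MVCCStore:
    """Hypothetical in-memory store keeping multiple versions per key."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the most recent commit

    def begin(self):
        # A transaction reads the database state as of the current clock.
        return Transaction(self, self._clock)

    def _read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def _commit(self, txn):
        # Abort if any written key gained a version after the snapshot.
        for key in txn.writes:
            versions = self._versions.get(key, [])
            if versions and versions[-1][0] > txn.snapshot_ts:
                return False
        self._clock += 1
        for key, value in txn.writes.items():
            # Old versions are kept; the new one is appended, not overwritten.
            self._versions.setdefault(key, []).append((self._clock, value))
        return True

    def vacuum(self, oldest_active_ts):
        """Drop versions no snapshot at or after oldest_active_ts can read."""
        removed = 0
        for key, versions in self._versions.items():
            keep_from = 0
            for i, (commit_ts, _) in enumerate(versions):
                if commit_ts <= oldest_active_ts:
                    keep_from = i
            removed += keep_from
            self._versions[key] = versions[keep_from:]
        return removed


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}  # buffered writes, invisible to others until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store._read(key, self.snapshot_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        return self.store._commit(self)


store = MVCCStore()
setup = store.begin()
setup.write("x", 1)
setup.commit()

reader = store.begin()            # snapshot taken before the next write
writer = store.begin()
writer.write("x", 2)
writer.commit()
old_view = reader.read("x")       # reader still sees its snapshot
new_view = store.begin().read("x")

t1, t2 = store.begin(), store.begin()
t1.write("x", 10)
t2.write("x", 20)
first, second = t1.commit(), t2.commit()  # concurrent updates to the same key

swept = store.vacuum(oldest_active_ts=3)  # assumes no snapshot older than 3
```

Here the reader needs no lock: it simply ignores versions newer than its snapshot timestamp. The second of the two concurrent writers is rejected at commit time, and the sweep removes the versions that no remaining snapshot can reach.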