The development of the EDVAC computer system in the late 1940s is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices that may be found in many different settings. A computer system typically includes a combination of hardware (such as semiconductors, integrated circuits, programmable logic devices, programmable gate arrays, and circuit boards) and software, also known as computer programs.
Years ago, computers were isolated devices that did not communicate with each other. Today, however, computers are often connected in networks, such as the Internet or World Wide Web, and a user at one computer, often called a client, may wish to access information at multiple other computers, often called servers, via a network. Accessing and using information from multiple computers is often called distributed computing.
One of the challenges of distributed computing is the propagation of messages from one computer system to another. In many distributed computing systems connected via networks, to maintain data consistency it is critical that each message be delivered only once and in order to its intended destination site. For example, in a distributed database system, messages that are propagated to a destination site often specify updates that must be made to data that reside at the destination site. The updates are performed as a "transaction" at the destination site. Frequently, such transactions are part of larger distributed transactions that involve many sites. If the transactions are not delivered once and in order, problems with data consistency may occur; e.g., if a database insert and a subsequent update arrive out of order, the update may attempt to modify a record that is not yet present.
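The once-and-in-order delivery requirement can be sketched with per-destination sequence numbers, one illustrative technique among several; the class and method names below are hypothetical, not from any particular system. The destination applies a message only when it carries the next expected sequence number, so duplicates are suppressed and out-of-order messages are not applied.

```python
# Illustrative sketch: once-and-in-order message delivery via sequence numbers.
class Destination:
    def __init__(self):
        self.next_seq = 0      # next sequence number this site expects
        self.applied = []      # messages applied, in order

    def deliver(self, seq, payload):
        if seq != self.next_seq:
            return False       # duplicate or out-of-order: not applied
        self.applied.append(payload)
        self.next_seq += 1
        return True

d = Destination()
d.deliver(0, "insert row")
d.deliver(0, "insert row")      # duplicate: suppressed
d.deliver(2, "update row")      # out of order: not applied
d.deliver(1, "update row")      # the expected message: applied
print(d.applied)  # ['insert row', 'update row']
```

In this scheme an out-of-order update simply fails to apply until the missing insert arrives, which illustrates why ordering matters for consistency.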
To ensure safe data sharing in a distributed computing environment, transactions must share the properties of atomicity, consistency, isolation, and durability, denoted by the acronym ACID. Atomicity means that a transaction is considered complete if and only if all of its operations were performed successfully; if any operation in a transaction fails, the transaction fails. Consistency means that a transaction must transition data from one consistent state to another, preserving the data's semantic and referential integrity. While applications should always preserve data consistency, many databases provide ways to specify integrity and value constraints, so that transactions that attempt to violate consistency will automatically fail. Isolation means that any changes made to data by a transaction are invisible to other concurrent transactions until the transaction commits. Isolation requires that several concurrent transactions produce the same results in the data as those same transactions would produce if executed serially, in some (unspecified) order. Durability means that committed updates are permanent: failures that occur after a commit cause no loss of data. Durability also implies that data for all committed transactions can be recovered after a system or media failure. An ACID transaction ensures that persistent data always conform to their schema, that a series of operations can assume a stable set of inputs and working data, and that persistent data changes are recoverable after system failure.
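The atomicity property can be demonstrated on a single site with Python's built-in sqlite3 module (a minimal sketch, not a distributed system): when one operation in a transaction fails, rolling back undoes every operation in it, leaving the data in its prior consistent state.

```python
# Minimal sketch of transaction atomicity using Python's built-in sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

try:
    # Transfer 30 from alice to bob as a single transaction.
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
    # This statement fails (no such table), so the whole transaction fails.
    conn.execute("UPDATE no_such_table SET balance = balance + 30")
    conn.commit()
except sqlite3.OperationalError:
    conn.rollback()  # atomicity: the debit to alice is undone as well

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 100, 'bob': 50}
```

Without the rollback, alice would be debited while bob is never credited, violating consistency as well as atomicity.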
One approach for ensuring that transactions are ACID in a distributed system is to use a two-phase commit protocol to propagate messages between the distributed computer systems. The two-phase commit protocol involves two phases: the prepare phase and the commit phase. In the prepare phase, the transaction is prepared at the destination site. When a transaction is prepared at a destination site, the database is put into a state in which it is guaranteed that the modifications specified by the transaction can be committed. Once the destination site is prepared, it is said to be in an in-doubt state. In this context, an in-doubt state is a state in which the destination site has obtained the necessary resources to commit the changes for a particular transaction, but has not done so because a commit request has not been received from the source site. Thus, the destination site is in doubt as to whether the changes for the particular transaction will go forward and be committed or instead be rolled back. After the destination site is prepared, the destination site sends a prepared message to the source site, so that the commit phase may begin.
In the commit phase, the source site communicates with the destination site to coordinate either the committing or rollback of the transaction. Specifically, the source site either receives prepared messages from all of the participants in the distributed transaction, or determines that at least one of the participants has failed to prepare. The source site then sends a message to the destination site to indicate whether the modifications made at the destination site as part of the distributed transaction should be committed or rolled back. If the source site sends a commit message to the destination site, the destination site commits the changes specified by the transaction and returns a message to the source site to acknowledge the committing of the transaction.
Alternatively, if the source site sends a rollback message to the destination site, the destination site rolls back all of the changes specified by the distributed transaction and returns a message to the source site to acknowledge the rolling back of the transaction. Thus, the two-phase commit protocol may be used to attempt to ensure that the messages are propagated exactly once and in order. The two-phase commit protocol further ensures that the effects of a distributed transaction are atomic, i.e., either all the effects of the transaction persist or none persist, whether or not failures occur.
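The two phases described above can be sketched as follows; the Coordinator and Participant classes are hypothetical stand-ins for the source and destination sites, and a real implementation would add logging, timeouts, and failure recovery.

```python
# Illustrative sketch of the two-phase commit protocol.
class Participant:
    """A destination site that can prepare, commit, or roll back."""
    def __init__(self, name, will_prepare=True):
        self.name = name
        self.will_prepare = will_prepare  # simulates prepare success/failure
        self.state = "idle"

    def prepare(self):
        # Acquire the resources needed to guarantee a later commit.
        if self.will_prepare:
            self.state = "in-doubt"   # prepared, awaiting the coordinator
            return True
        self.state = "aborted"
        return False

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"

class Coordinator:
    """The source site: runs the prepare phase, then the commit phase."""
    def run(self, participants):
        # Phase 1: ask every participant to prepare and collect their votes.
        votes = [p.prepare() for p in participants]
        all_prepared = all(votes)
        # Phase 2: commit only if every participant prepared; else roll back.
        for p in participants:
            if all_prepared:
                p.commit()
            else:
                p.rollback()
        return "committed" if all_prepared else "rolled back"

sites = [Participant("db1"), Participant("db2", will_prepare=False)]
print(Coordinator().run(sites))  # rolled back
```

Because the coordinator waits for every vote before deciding, either all participants commit or all roll back, which is the atomicity guarantee the protocol provides.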
Although two-phase commit processing can work well, it is expensive because of the high volume of control messages and network traffic it requires. In transaction processing systems, committing updates on completion of a transaction involves a relatively high processing overhead, which hurts performance. An alternative to two-phase commit processing is one-phase commit processing, where a single site makes its own commit and rollback decisions without depending on other sites. Unfortunately, one-phase commit processing does not guarantee the ACID properties when multiple sites are involved.
If two or more resources are involved in a transaction, then two-phase commit processing is used, along with its high overhead. If, however, only a single resource is used within a transactional context, then one-phase commit processing may be used, which has less overhead than two-phase commit processing.
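This resource-count optimization can be sketched in a few lines; the function name and protocol labels are illustrative, not from any real transaction-manager API.

```python
# Illustrative sketch: pick the cheaper protocol when only one resource
# participates, since a lone resource can decide commit/rollback itself.
def choose_commit_protocol(resources):
    if len(resources) <= 1:
        return "one-phase"   # single resource: no coordination needed
    return "two-phase"       # multiple resources: coordinator required

print(choose_commit_protocol(["db1"]))           # one-phase
print(choose_commit_protocol(["db1", "queue"]))  # two-phase
```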
Without a better way to handle two-phase commit processing, transactions will continue to suffer from impaired performance. Although the aforementioned problems have been described in the context of database transactions, they may occur in any type of transaction or application. Further, although the source and destination sites have been described as if they exist on different computers attached via a network, some or all of them may reside on the same computer.