Consistency is a fundamental design goal of any transaction processing system. For example, when transferring money from a savings account to a checking account, the total of the two accounts must remain the same. If the savings account is debited but the checking account is not credited, the customer will be dissatisfied. On the other hand, if the checking account is credited but the savings account is not debited, the bank will become concerned.
A transaction is a collection of operations that performs a single logical action in a database application. In the above example, conducting an account transfer is a transaction. Debiting the savings account is one operation in the account transfer transaction, and crediting the checking account is another operation. In order to preserve consistency, every operation of a transaction must succeed or the entire transaction must fail--that is, transactions can leave no work partially done. This requirement is called "atomicity."
Under normal conditions, consistency can be enforced in a transaction processing system by simply carrying out every operation in the transaction. However, software bugs, hardware crashes, and power outages can cause a computer system to fail. When a failure occurs, some information, particularly information that is stored in volatile memory such as RAM, may be lost and consistency violated. For example, a banking transaction processing system might debit the savings account but crash before crediting the checking account. To guarantee consistency, a transaction processing system must ensure that all of the operations of a transaction are executed or, if one or more of the operations that make up the transaction fails, none of the operations of the transaction are executed.
If a transaction cannot be successfully completed, and only some of the operations are executed, then the transaction must be "rolled back" and the completed operations undone. On the other hand, if all the operations in a transaction successfully execute, then the transaction is "committed" and all of the operations are permanently stored in a database.
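By way of illustration only, the all-or-nothing behavior described above can be sketched in a few lines of Python. The function name `transfer`, the `fail_before_credit` flag, and the in-memory undo log are hypothetical and chosen for this example; a real system would keep the log in non-volatile storage so that it survives a crash.

```python
class InsufficientFunds(Exception):
    """Raised when a debit would take a balance below zero."""


def transfer(accounts, source, target, amount, fail_before_credit=False):
    """Move `amount` from `source` to `target`, or leave `accounts` unchanged."""
    undo_log = []  # (account name, old balance) pairs, newest last
    try:
        # Operation 1: debit the source account.
        if accounts[source] < amount:
            raise InsufficientFunds(source)
        undo_log.append((source, accounts[source]))
        accounts[source] -= amount

        # Simulated failure between the two operations (assumption for the demo).
        if fail_before_credit:
            raise RuntimeError("simulated failure before credit")

        # Operation 2: credit the target account.
        undo_log.append((target, accounts[target]))
        accounts[target] += amount
    except Exception:
        # Roll back: restore the old values in reverse order.
        for name, old_balance in reversed(undo_log):
            accounts[name] = old_balance
        return "rolled back"
    return "committed"
```

In this sketch, a transfer that fails after the debit restores the savings balance, so the total of the two accounts is the same before and after the call.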
A distributed transaction is one in which operations occur on several different computers or in several different processes in a computer network. For example, checking account operations might occur on a first computer, savings account operations might occur on a second computer, and the request to transfer funds from a savings account to a checking account might originate at a third computer used by a bank teller. Every process or computer that is involved in the transaction is called a "participant." In a distributed transaction, partial failures can occur in which some computers are working while others are not, or where the computers are working but communication links between the computers have failed. Partial failures can also occur where one operation fails because it violates a logical constraint, either in the system or on its local computer.
The benefits of a distributed transaction system include improved performance and scalability, that is, the ability to support additional users by adding additional computers without losing performance. Because failures in a distributed system are usually partial ones, another benefit is improved reliability--if one computer fails, often the system continues to operate. However, the possibility of a partial failure makes the assurance of consistency more problematic. For example, the computer that processes savings account operations might function normally, deducting the debit, whereas the computer that processes the checking account operations might fail without adding the credit.
In order to preserve consistency, transaction processing systems implement a "commitment protocol." One common commitment protocol for distributed transactions is the "two-phase commit" protocol. A two-phase commit protocol generally includes two phases: a prepare transaction phase and a resolve transaction phase.
First, in the prepare transaction phase, the creator of a transaction (also referred to here as the "owner") asks each participant to prepare to commit the transaction. Each participant must determine whether it wishes to commit or abort the transaction. If the participant wishes to commit the transaction, it records the fact that the transaction is prepared for commitment to its local transaction log in non-volatile storage. The local transaction log will have already recorded the old and new values of the local changes made by that transaction to the database. Then the participant sends a "commit" vote back to the owner.
If the participant decides to abort, it records an abort of the transaction to non-volatile storage and sends a "roll back" vote back to the owner. There are a number of reasons why a participant might decide to abort. An operation may violate some constraint imposed on the logic of that operation. For example, if debiting the savings account would reduce the balance in the savings account below zero, then that participant would abort the transfer transaction.
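The participant's side of the prepare transaction phase might be sketched as follows. This is a minimal illustration under stated assumptions: the `Participant` class, its `prepare` method, and the tuple layout of the log records are hypothetical, a Python list stands in for the transaction log in non-volatile storage, and, for brevity, the old and new values are logged inside `prepare` rather than earlier in the transaction.

```python
class Participant:
    """Hypothetical participant holding one account balance."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.log = []  # stand-in for a transaction log in non-volatile storage

    def prepare(self, txn_id, delta):
        """Vote on applying `delta` to the balance; no change is applied yet."""
        new_balance = self.balance + delta
        if new_balance < 0:
            # Constraint violated: record the abort, then vote to roll back.
            self.log.append((txn_id, "abort"))
            return "roll back"
        # Record old and new values, then the prepared-to-commit record.
        self.log.append((txn_id, "update", self.balance, new_balance))
        self.log.append((txn_id, "prepared"))
        return "commit"
```

Note that the log record is written before the vote is returned, so that a participant that has voted "commit" can still honor that vote after a restart.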
Second, in the resolve transaction phase, the creator collects all the votes from the participants. If all the participants voted to commit, then the owner records a commit of the transaction to its transaction log in non-volatile storage. At this point the transaction is committed. Then the owner sends a message to each participant to commit the transaction.
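The creator's role across both phases, including the roll-back case covered in the following paragraph, can be sketched as shown here. The names `StubParticipant`, `finish`, and `run_two_phase_commit` are hypothetical, the participants simply vote as configured, and a Python list again stands in for the creator's transaction log in non-volatile storage. Timeouts, lost messages, and crash recovery are deliberately omitted.

```python
class StubParticipant:
    """Hypothetical participant that votes as configured and records the outcome."""

    def __init__(self, vote):
        self.vote = vote      # "commit" or "roll back"
        self.outcome = None   # set when the creator's decision arrives

    def prepare(self, txn_id):
        return self.vote

    def finish(self, txn_id, decision):
        self.outcome = decision


def run_two_phase_commit(txn_id, participants, creator_log):
    # Phase 1 (prepare): ask every participant for its vote.
    votes = [p.prepare(txn_id) for p in participants]

    # Phase 2 (resolve): commit only if every vote is "commit".
    decision = "commit" if all(v == "commit" for v in votes) else "roll back"

    # Logging the decision is the point at which the outcome is fixed.
    creator_log.append((txn_id, decision))

    # Tell every participant the outcome.
    for p in participants:
        p.finish(txn_id, decision)
    return decision
```

A single "roll back" vote is enough to roll back the transaction at every participant, including those that voted to commit.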
If any participant voted to roll back the transaction, then the creator records an abort of the transaction to non-volatile storage, and sends a message to each participant to roll back the transaction. Each participant that placed a prepared-to-commit record in non-volatile storage will wait for a commit or roll back message from the creator before taking action.
One important application for a distributed transaction processing system is the Manufacturing Execution System ("MES"). Generally, an MES is an enterprise-wide software application integrated with one or more shop floors in a manufacturing company. Goals of an MES include sending timely shop floor data to the enterprise and incorporating enterprise-wide requirements into the shop floor.
Ideally, an MES should collect data from a variety of factory floor devices such as SCADA systems, controllers, test equipment, bar code scanners and others. An MES should also allow definition of bills of material and specification of routing steps and operation sequences. An MES can also define shop floor equipment, including its capabilities and requirements, and manufacturing employees, including their training and certifications. Output from an MES can include order tracking, work-in-process traceability, defect and rework management, work instruction management, data collection, statistical process control and statistical quality control ("SPC/SQC"), process licensing, item qualification, labor tracking and finite scheduling. An MES thus improves management's visibility into its manufacturing processes and interacts with enterprise resource planning and supply chain management software and professionals to reduce manufacturing lead times and meet customer and manufacturing deadlines.
Traditional transaction processing systems are inadequate for modern MES applications, which must coordinate data flow from around a manufacturing floor and from manufacturing operations around the world in real time or near real time. In addition, manufacturing operations run around the clock, seven days a week. Traditional transaction processing systems always include localized bottlenecks, resulting in reliability problems when local computers go down for any reason. Furthermore, the asynchronous nature of a global manufacturing enterprise and the data it produces can strain traditional transaction processing systems. In traditional systems that process transactions sequentially, waiting for data, sometimes from operations that have not yet occurred, ties up resources that need this data to proceed with a transaction--in a manufacturing execution system, this is a common situation. In addition, the asynchronous nature of the transactions can result in a large amount of network traffic for transaction processing systems with message-intensive commitment protocols.
Accordingly, it would be desirable to provide a transaction processing system having the ability to reliably process asynchronous transactions in an MES environment.