A portion of the disclosure of this patent document, referred to as "Appendix A", contains material, titled "Transaction Internet Protocol, Version 3.0," which is subject to copyright protection. The copyright owner has no objection to the xerographic reproduction by anyone of the patent document or the patent disclosure in exactly the form it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates generally to transaction processing, and more particularly to techniques for supporting and maintaining a distributed global map of transaction identifiers at the gateway processes and using a hashing algorithm to access these maps.
A transaction is most often defined as an explicitly delimited operation, or set of related operations, that change or otherwise modify the content of an information collection (e.g., database or databases) from one consistent state to another. Changes are treated as a single unit in that all changes of a transaction are formed and made permanent (the transaction is "committed") or none of the changes are made permanent (the transaction is "aborted"). If a failure occurs during the execution of a transaction, resulting in the transaction being aborted, whatever partial changes were made to the collection are undone to leave it in a consistent state.
A transaction processing system typically includes a transaction manager; a collection of subsystems, called resource managers (RMs), which are essentially abstractions of available services, such as database systems; application programs; and the like. The transaction processing system provides a way to interconnect applications and resource managers while maintaining data integrity and transactional consistency.
The application process initiating a transaction invokes various services and/or resource managers to perform various operations and tasks necessary to complete the transaction. All services and resource managers invoked to perform operations for the transaction register with a transaction manager, stating that they are joining the transaction. A transaction manager typically provides transaction management functions, such as monitoring the progress of the transaction and coordinating the commit processing and rollback of the transaction, and protects the integrity of user data. When all operations, or work, have completed, the initiating application process notifies the transaction manager of this fact. The transaction manager then initiates an agreement protocol to coordinate commitment processing among all services and resource managers (including foreign transaction managers) participating in the transaction. In transaction processing the standard agreement protocol is the two-phase commitment (2PC) protocol. A description of the 2PC protocol, as well as a detailed overview of transaction processing, is presented in J. Gray et al., Transaction Processing: Concepts and Techniques, Morgan Kaufmann, 1993, the contents of which are herein incorporated by reference.
Briefly, in phase one of the 2PC protocol, the transaction manager issues a request prepare signal to each participant (i.e., the transaction manager asks each participating service or resource manager if it believes the operations it performed to be a consistent and complete transformation). If any participant votes no, the commit fails and the transaction is aborted and rolled back; if all participating resource managers vote yes (ready to commit), the transaction is a correct transformation and phase two commences. In phase two of the 2PC protocol, the transaction manager issues a commit request signal informing each participant that the transaction is complete, and records this fact in the transaction's log. After all participants acknowledge the commit request, the transaction manager records this fact and forgets about the transaction.
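The two phases described above can be illustrated with a minimal sketch. It is not taken from the disclosure: the `Participant` and `TransactionManager` classes, their method names, and the in-memory `log` list are hypothetical, and failures, timeouts, and durable logging are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for a resource manager or service."""
    name: str
    will_vote_yes: bool = True
    state: str = "active"

    def prepare(self) -> bool:
        # Phase one: vote on whether the local work is consistent and complete.
        if self.will_vote_yes:
            self.state = "prepared"
        return self.will_vote_yes

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


@dataclass
class TransactionManager:
    """Hypothetical coordinator; `log` stands in for the transaction log."""
    log: list = field(default_factory=list)

    def two_phase_commit(self, txn_id: str, participants: list) -> str:
        # Phase one: ask every participant to prepare and collect the votes.
        votes = [p.prepare() for p in participants]
        if not all(votes):
            # Any "no" vote aborts the transaction and rolls it back everywhere.
            for p in participants:
                p.rollback()
            self.log.append((txn_id, "aborted"))
            return "aborted"
        # Phase two: record the decision and tell each participant to commit.
        self.log.append((txn_id, "committed"))
        for p in participants:
            p.commit()
        # Once all participants have acknowledged, the manager records that
        # fact and can forget the transaction.
        self.log.append((txn_id, "forgotten"))
        return "committed"
```

In this sketch a single participant created with `will_vote_yes=False` is enough to make `two_phase_commit` return `"aborted"` and leave every participant rolled back.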
Recently, a Transaction Internet Protocol (TIP) that uses the 2PC paradigm has been proposed by the Internet Engineering Task Force (IETF). Attached hereto, as Appendix A, is the final version of the IETF paper describing TIP and its requirements. The IETF paper describes a simple 2PC protocol applicable to transactions involving resources in a distributed, Internet-connected transaction. Basically, two models are described: a "Push" model and a "Pull" model.
In the Push model, an application on a first transaction processing system requests that the transaction manager of that system "export" a transaction, T1, to a second transaction monitoring system to perform some work on behalf of the application. The transaction manager of the first system "pushes" transaction T1 to the second system by sending a message to the transaction manager of the second system. The message requests that the second system start a local transaction associated with transaction T1 as a subordinate of the first system, and return the name, for example "LT1", for that local transaction branch on the second system together with the Internet address of the local transaction branch. The transaction manager forwards to the application the name, LT1, and the Internet address of the transaction on the second system associated with transaction T1. The application then sends a message to the desired application on the second system, asking it to "do some work, and make it part of the transaction that your transaction manager already knows of by the name of LT1." Additionally, the first and second transaction managers each update their own global map by associating the global transaction T1 initiated on the first system with the exported transaction branch LT1. The global map is a data structure that each transaction manager maintains in order to associate any and all remote transaction branches, such as LT1, with associated global transactions, such as T1. Because the first system's transaction manager knows that it sent the transaction to the second system's transaction manager, the first system's transaction manager knows to involve the second system's transaction manager in the 2PC process.
In the Pull model, an application on the first system merely sends a message to an application on the second system, requesting that it "do some work, and make it part of a transaction that my transaction manager knows by the name of T1." The application on the second system then requests that its transaction manager enlist in the transaction T1. The second system's transaction manager "pulls" transaction T1 over from the first system and then initiates a local transaction, LT1, associated with transaction T1. Also, both transaction managers update their global maps. As a result of the pull, the first system's transaction manager knows to involve the second system's transaction manager in the 2PC process.
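The difference between the two models, and the global-map update they share, can be sketched as follows. This is an illustration only, not the TIP wire protocol: the `TransactionManager` class, its `push` and `pull` methods, the node names, and the choice to key the map by (global identifier, peer node) are all assumptions made for the example.

```python
class TransactionManager:
    """Hypothetical transaction manager holding one global map."""

    def __init__(self, node: str):
        self.node = node
        # (global transaction id, peer node) -> local branch name such as "LT1"
        self.global_map = {}
        self._branch_counter = 0

    def _start_local_branch(self, global_id: str, superior_node: str) -> str:
        # Start a local transaction as a subordinate of the superior node
        # and remember which global transaction it belongs to.
        self._branch_counter += 1
        local_id = f"LT{self._branch_counter}"
        self.global_map[(global_id, superior_node)] = local_id
        return local_id

    def push(self, global_id: str, remote_tm: "TransactionManager") -> str:
        # Push: this (superior) manager asks the remote manager to start a
        # branch, then records the returned name in its own global map.
        local_id = remote_tm._start_local_branch(global_id, self.node)
        self.global_map[(global_id, remote_tm.node)] = local_id
        return local_id

    def pull(self, global_id: str, superior_tm: "TransactionManager") -> str:
        # Pull: this (subordinate) manager enlists in the superior's
        # transaction and starts the branch itself; both maps are updated.
        local_id = self._start_local_branch(global_id, superior_tm.node)
        superior_tm.global_map[(global_id, self.node)] = local_id
        return local_id
```

In both cases the superior ends up with an entry naming the remote branch, which is what lets it involve the remote manager in the 2PC process.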
In both the Push model and the Pull model, an application on the first system communicates with the second system via a gateway process. In some cases where there is only one gateway process associated with the transaction manager, many applications resident on the transaction system may attempt to communicate with other systems through the single gateway process. This may result in a bottleneck at the gateway, thereby degrading system performance. It is thus desirable to include multiple gateways to enhance system performance. When multiple gateways are used, if a second application desires to export a transaction branch (push or pull) associated with the transaction (T1) to the second system, the transaction manager must check the global map to determine whether transaction T1 has been exported to the second system. If so, the transaction manager returns the local transaction identifier, here LT1, to the application and the application then communicates with the second system. Although this guarantees that the same transaction will never be exported twice to the same remote node, the process requires checking the global map in the transaction manager.
The present invention provides systems and methods for efficiently communicating with remote nodes by using a hashing function to access a distributed global map of transaction identifiers. In particular, the present invention provides techniques for supporting and maintaining a distributed global map of transaction identifiers at the gateway processes and using a hashing algorithm configured on each application process to access the global maps.
The techniques of the present invention allow for efficient communication with remote nodes using multiple gateway processes without the delay associated with checking a central global map of transaction identifiers at the transaction manager to determine the appropriate gateway process for exporting a transaction to a remote node. According to the invention, a portion of the global map of transaction identifiers is maintained at each gateway process. The global map at each gateway associates global transaction identifiers (such as T1 above) with local transaction identifiers (such as LT1). When an application process performing work for a particular transaction desires to export the particular transaction, a hashing function configured on the application process is applied to the global transaction identifier associated with the particular transaction. Application of the hashing function to the global transaction identifier identifies one of the gateway processes. The global transaction identifier is stored in the global map associated with that gateway process. When the remote transaction manager associated with the remote node responds with a local transaction identifier for a local transaction initiated at the remote node on behalf of the exported transaction, the local transaction identifier is stored to the identified gateway's global map in association with the global transaction identifier.
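As an illustration of how a hashing function might map a global transaction identifier to a gateway, the sketch below hashes the identifier and reduces it modulo the number of gateways. The disclosure does not specify a particular hashing function; the use of SHA-256, the function name `pick_gateway`, and the string form of the identifier are assumptions. A digest from `hashlib` is used rather than Python's built-in `hash()`, because the built-in is randomized per process for strings and separate application processes would then disagree.

```python
import hashlib


def pick_gateway(global_txn_id: str, num_gateways: int) -> int:
    """Return the index of the gateway responsible for this identifier.

    Hypothetical helper: any deterministic function shared by all
    application processes would serve the same purpose.
    """
    if num_gateways < 1:
        raise ValueError("at least one gateway is required")
    digest = hashlib.sha256(global_txn_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer and fold it onto
    # the available gateways.
    return int.from_bytes(digest[:8], "big") % num_gateways
```

Because the result depends only on the identifier and the gateway count, every application process that calls `pick_gateway("T1", 4)` obtains the same index.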
Each application process is configured with the same hashing function so that the same gateway process will always be identified given a particular global transaction identifier. Thus, if the same or another application process desires to export a transaction that has already been exported to a remote node, the hashing function on that application process identifies the same gateway. The identified gateway will check its global map and will see that the transaction has been previously exported. If the transaction has been previously exported to the same remote node to which the application process is now trying to export a transaction branch, the gateway will return the local transaction identifier associated with that remote node to the application process. In this manner, the same transaction will not be exported twice to the same remote node. If, on the other hand, it is determined that the transaction has not been exported to the desired remote node, the transaction will be exported to that node. When a local transaction identifier is received from the desired node, an entry will be made to the global map associating the transaction (i.e., the global transaction identifier) with that local transaction identifier.
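The export-once behavior described above can be sketched with a gateway that consults its own portion of the global map before exporting. The `Gateway` class, the `gateway_for` helper, the `export_fn` callback standing in for the exchange with the remote transaction manager, and the SHA-256-based selection are all hypothetical choices for the example, not details from the disclosure.

```python
import hashlib


class Gateway:
    """Hypothetical gateway process holding its portion of the global map."""

    def __init__(self, name, export_fn):
        self.name = name
        # (global transaction id, remote node) -> local transaction id
        self.global_map = {}
        # Callback that performs the real export and returns the local
        # transaction identifier supplied by the remote transaction manager.
        self._export_fn = export_fn

    def export(self, global_id, remote_node):
        key = (global_id, remote_node)
        if key in self.global_map:
            # Already exported to this node: reuse the existing branch.
            return self.global_map[key]
        local_id = self._export_fn(global_id, remote_node)
        self.global_map[key] = local_id
        return local_id


def gateway_for(global_id, gateways):
    """Pick the same gateway for a given identifier in every process."""
    digest = hashlib.sha256(global_id.encode("utf-8")).digest()
    return gateways[int.from_bytes(digest[:8], "big") % len(gateways)]
```

Two application processes that both call `gateway_for("T1", gateways).export("T1", "nodeB")` reach the same gateway, so the second call returns the stored local identifier without invoking `export_fn` again, while an export of T1 to a different node creates a new entry.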
According to one aspect of the invention, a method is provided for communicating with a remote node from a transaction processing system having a transaction manager, one or more application processes and two or more gateway processes for communicating with remote nodes. Each of the gateway processes has a global map that associates global transaction identifiers with local transaction identifiers and each of the one or more application processes is configured with the same hashing function. Application of the hashing function to a particular global transaction identifier always identifies the same gateway process. The method typically comprises the steps of performing work for a first transaction by a first application process, the first transaction having an associated global transaction identifier, and applying the hashing function to the global transaction identifier by the first application process so as to identify a first one of the gateway processes when the first application process desires to export the first transaction to a remote node. The method also typically includes the steps of exporting the first transaction to the remote node, and storing the global transaction identifier to the global map of the first gateway process.
According to another aspect of the invention, a transaction processing system is provided, which typically comprises a first transaction manager (TM) and two or more gateways associated with the TM for communicating with remote nodes. Each of the gateways has a global map that associates global transaction identifiers with local transaction identifiers. The system also typically comprises an application process that performs work for a first transaction, wherein the first transaction has an associated global transaction identifier. The application process is configured with a hashing function, wherein application of the hashing function to a particular global transaction identifier always identifies the same gateway. When the application process desires to export the first transaction to a remote node, the application process applies the hashing function to the global transaction identifier of the first transaction so as to identify a first one of the gateways. The first transaction is then exported to the remote node.
Reference to the remaining portions of the specification, including the drawings and claims, will realize other features and advantages of the present invention. Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with respect to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.