Today many systems, for example the computer systems of financial institutions, are isolated in the sense that, although they are able to communicate with each other, for example through offline, batch-oriented exchange of transaction files, they are not able to communicate online using bidirectional communication protocols.
For this reason, communication between isolated systems is usually done in a fire-and-forget manner. In this manner, a requester application is able to continue its processing after submitting a request, regardless of whether the service providers needed to perform the tasks are immediately available. In addition, the user initiating the request has no functional need to wait until the request is fully processed. The service requesters and service providers must be able to produce and consume messages at their own pace, without depending directly on each other being present. In practice, however, most messages are sent over unreliable networks and may, in the worst case, be lost.
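The fire-and-forget manner described above can be illustrated by a minimal Python sketch. The names `provider`, `fire_and_forget` and `demo` are hypothetical and serve only as illustration. An in-process queue stands in for the message channel; unlike the unreliable networks mentioned above, such a queue does not lose messages.

```python
import queue
import threading


def provider(inbox, processed):
    """Hypothetical service provider: consumes requests at its own pace."""
    while True:
        request = inbox.get()
        if request is None:  # sentinel value: no more requests will arrive
            break
        processed.append(request.upper())


def fire_and_forget(inbox, request):
    """Submit a request and return immediately, without waiting for a reply."""
    inbox.put(request)


def demo():
    inbox = queue.Queue()
    processed = []
    # The requester submits requests although no provider is running yet.
    fire_and_forget(inbox, "buy voucher")
    fire_and_forget(inbox, "check balance")
    # The provider becomes available later and drains the queue.
    worker = threading.Thread(target=provider, args=(inbox, processed))
    worker.start()
    inbox.put(None)
    worker.join()
    return processed
```

The requester never learns whether a request was processed, which is precisely the property that becomes problematic once several systems take part in one transaction.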
The problem is that, when multiple systems are involved in the same transaction, the reliability of the communication becomes more important.
An atomic transaction is a series of operations in which either all operations occur or none occur. A guarantee of atomicity prevents, for example, updates to a database from occurring only partially, which can cause greater problems than rejecting the whole series outright.
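Atomicity within a single database can be illustrated by the following sketch, which uses the `sqlite3` module of the Python standard library. The table `accounts`, the account names and the helper functions are assumptions made for illustration only. The connection is used as a context manager, which commits when the block succeeds and rolls back when an exception is raised.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Apply both updates or neither; return True if the transfer committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

When the debit violates the constraint, the credit that was already applied is undone as well, so no partial update remains. Such a guarantee is readily available inside one database but not across isolated systems.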
To achieve an atomic transaction across all involved systems, it is important to be able to trust that messages are correctly received and acted upon. To complicate the situation further, each system may also have its own rules for approving or denying its sub transaction.
As a result, some of the sub transactions may be finished while others remain incomplete.
An example is a financial transaction involving two different parties, a bank and a power company, where the transaction is initiated by a third party. In this use case, a customer wants to buy an electricity voucher from the power company and pay with money from his bank account.
Two sub transactions are involved, and either both must be carried out or neither. The first sub transaction is the movement of money in the bank. The second is the issuing of the voucher. The possible errors are that there is not enough money in the bank account or that the delivery of the voucher fails.
Solutions today are usually implemented using the aforementioned fire-and-forget pattern, where emphasis is placed on the success case, i.e. the case in which all sub transactions are successfully executed.
One example of a computer system using such a fire-and-forget pattern is disclosed in WO 2006079001, which describes a system for transferring data related to financial transactions over a public network, the system including a public network and a plurality of participant computer systems. Each participant computer system is in communication with the public network. Each participant computer system includes one or more applications, and a gateway in communication with the public network. Each application is in communication with a gateway. Each application transmits and receives one or more messages to one or more participant computer systems to perform a function related to a financial transaction. The gateway provides a standard interface for sending and receiving data between applications over the public network.
A problem with the fire-and-forget pattern is that error cases become complicated to roll back or to resolve in other ways. In some cases it even becomes impossible to determine the correct status if some sub transactions have been successfully executed while others have not.
The outcome is that the global transaction ends up in an undefined state where some parts are committed and some are rolled back. This makes manual clean-up necessary. For example, in the financial transaction involving the bank and the power company, an error in issuing the voucher would require the reversal of the funds transfer.
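The voucher example can be sketched in Python as follows. The classes `Bank` and `PowerCompany` and the functions `buy_voucher_fire_and_forget` and `reverse_payment` are hypothetical and are not taken from any existing system. The sketch only illustrates how independently executed sub transactions can leave the global transaction partly committed, and what the subsequent clean-up amounts to.

```python
class Bank:
    """Hypothetical bank holding a single customer balance."""

    def __init__(self, balance):
        self.balance = balance

    def debit(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def credit(self, amount):
        self.balance += amount


class PowerCompany:
    """Hypothetical voucher issuer that may be unable to deliver."""

    def __init__(self, available=True):
        self.available = available
        self.vouchers = []

    def issue_voucher(self, amount):
        if not self.available:
            raise RuntimeError("voucher delivery failed")
        self.vouchers.append(amount)


def buy_voucher_fire_and_forget(bank, power, amount):
    """Run both sub transactions independently and record each outcome."""
    steps = (
        ("payment", lambda: bank.debit(amount)),
        ("voucher", lambda: power.issue_voucher(amount)),
    )
    outcome = {}
    for name, step in steps:
        try:
            step()
            outcome[name] = "committed"
        except Exception:
            outcome[name] = "failed"
    return outcome


def reverse_payment(bank, outcome, amount):
    """Clean-up step: refund the payment if no voucher was issued."""
    if outcome["payment"] == "committed" and outcome["voucher"] == "failed":
        bank.credit(amount)
        outcome["payment"] = "reversed"
    return outcome
```

With a bank balance of 100 and an unavailable power company, a purchase of 30 leaves the payment committed and the voucher failed until `reverse_payment` is applied. In the opposite case, where the payment fails but the voucher is issued, this sketch offers no remedy at all.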