Data processing systems that process high rates of requests or transactions (also referred to herein as transaction processing systems) are known to need fault tolerance. To handle the failure of a single processing node, such a system typically utilizes multiple processing nodes. That way, if one node fails, at least one other node is available to continue processing requests.
In general, the nodes processing requests may have state. In order for a node n2 to take over for a failed node n1, the state of n2 has to be updated with the state of n1. One way this has been done in the past is to have a primary node along with a back-up node that follows the same transactions as the primary node, but a few steps behind. That way, if the primary fails, the back-up can take over for the primary.
A key problem with this approach is that a failure of the primary cannot be handled instantly: the failure must first be detected, and the back-up must then take over for the primary. Request processing is disrupted while this work is carried out. In many mission-critical environments, such a disruption in the event of a failed primary is not acceptable.
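By way of illustration only, the primary/back-up arrangement described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular prior system: the names Node, PrimaryBackupPair, submit, fail_primary and fail_over are hypothetical, node state is modeled as a simple key-to-balance mapping, the back-up is assumed to trail the primary by a fixed number of transactions, and the list of not-yet-replicated transactions is assumed to survive the failure of the primary.

```python
from collections import deque


class Node:
    """Hypothetical processing node; state is a key -> balance mapping."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.applied = 0  # number of transactions applied so far

    def apply(self, txn):
        key, delta = txn
        self.state[key] = self.state.get(key, 0) + delta
        self.applied += 1


class PrimaryBackupPair:
    """Hypothetical pair in which the back-up trails by up to `lag` transactions."""

    def __init__(self, lag=2):
        self.primary = Node("n1")
        self.backup = Node("n2")
        self.lag = lag
        # Assumption: this pending list outlives a failure of the primary.
        self.pending = deque()
        self.primary_alive = True

    def submit(self, txn):
        if not self.primary_alive:
            raise RuntimeError("primary has failed; fail_over() is required")
        self.primary.apply(txn)
        self.pending.append(txn)
        # The back-up follows the same transactions, a few steps behind.
        while len(self.pending) > self.lag:
            self.backup.apply(self.pending.popleft())

    def fail_primary(self):
        """Simulate a failure of the primary node."""
        self.primary_alive = False

    def fail_over(self):
        """Promote the back-up; return how many transactions it had to replay."""
        replayed = 0
        while self.pending:
            self.backup.apply(self.pending.popleft())
            replayed += 1
        self.primary, self.backup = self.backup, None
        self.primary_alive = True
        return replayed


pair = PrimaryBackupPair(lag=2)
for txn in [("a", 10), ("b", 5), ("a", -3), ("b", 2), ("a", 1)]:
    pair.submit(txn)

backup_applied_before = pair.backup.applied  # back-up is two transactions behind
pair.fail_primary()
replayed = pair.fail_over()  # catch-up work performed only at failover time
```

In this sketch, no request can be accepted between fail_primary() and the completion of fail_over(), which corresponds to the disruption discussed above. Failure detection itself is not modeled.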
Accordingly, improved techniques are needed for processing transactions in a data processing system.