1. Field of the Invention
Embodiments of the invention generally relate to fault-tolerant systems and, more specifically, to a method and apparatus for transactional fault tolerance in a client-server system.
2. Description of the Related Art
Many different approaches to fault-tolerant computing are known in the art. Fault tolerance is the ability of a system to continue to perform its functions even when one or more components of the system have failed. Fault-tolerant computing is typically based on replicating components and ensuring equivalent operation among the replicas. Fault-tolerant systems are typically implemented by replicating hardware, for example by providing pairs of servers, one primary and one secondary. In one type of fault-tolerant mechanism, each of the servers executes the same software. The replicated software is arranged to operate in lockstep during normal operation, and a mechanism is provided to detect a loss of lockstep. Another approach involves taking periodic snapshots, logging events, and replaying the logged events after failover. However, any approach that requires replay of events after a failover, or that requires the primary and secondary servers to execute in lockstep, must solve complex problems relating to deterministic replay of instructions on the secondary server. These problems are exacerbated in multi-processor systems. Accordingly, there exists a need in the art for a fault-tolerant system that requires neither logging and replay nor lockstep execution.