1. Technical Field
The present invention generally relates to computer systems and in particular to checkpoint operations within computer systems.
2. Description of the Related Art
Checkpoint operations enable a computer to backtrack to a previously acceptable machine state. As an alternative to completely restarting an execution, a checkpoint provides a point from which execution of a process may be resumed when a fault occurs. Existing forms of checkpoint operations save the checkpoint state of a machine in a fault-tolerant memory as changes are made to the main memory between checkpoints. Each time a memory location is written, selected contents of the main memory are recorded in a first-in, first-out (FIFO) buffer. If a fault occurs, the state of the memory at the end of the last checkpoint cycle can be reconstructed by reading the FIFO buffer and rewriting the main memory from its contents.
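By way of illustration only, the FIFO-based recovery described above might be sketched as follows. The sketch assumes that the FIFO buffer holds the pre-write value of each location and that only the first write to a location within a checkpoint cycle is logged, so that reading the buffer in order restores the checkpoint state. These assumptions, along with the names CheckpointedMemory, write, checkpoint, and recover, are hypothetical and are not taken from any particular existing system.

```python
from collections import deque


class CheckpointedMemory:
    """Toy main memory with a FIFO log of pre-write values (illustrative only)."""

    def __init__(self, size):
        self.main = [0] * size
        self.fifo = deque()    # (address, old_value) pairs, oldest first
        self._logged = set()   # addresses already logged in this cycle

    def write(self, addr, value):
        # Assumption: only the first write to an address in a cycle is logged,
        # so draining the FIFO in order restores the checkpoint state.
        if addr not in self._logged:
            self.fifo.append((addr, self.main[addr]))
            self._logged.add(addr)
        self.main[addr] = value

    def checkpoint(self):
        # Commit: the current contents become the new recovery point.
        self.fifo.clear()
        self._logged.clear()

    def recover(self):
        # Read the FIFO and rewrite main memory from its contents.
        while self.fifo:
            addr, old = self.fifo.popleft()
            self.main[addr] = old
        self._logged.clear()


mem = CheckpointedMemory(4)
mem.write(0, 10)
mem.checkpoint()   # state [10, 0, 0, 0] becomes the recovery point
mem.write(0, 99)
mem.write(0, 77)
mem.write(2, 5)
mem.recover()      # simulated fault: roll back to the last checkpoint
```

In this sketch, the writes made after the checkpoint are undone by recover, leaving the memory in the state it held when checkpoint was last called. Logging only the first write per location is one possible design choice; a log of every write would instead have to be replayed newest-first.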
Although the concept of checkpointing a processor state has existed for some time in support of fault tolerance, existing methods are not efficient. Recent work in speculative software optimizations, such as transactional memory, has motivated support for more efficient software-initiated checkpoint operations. The optimal design of a checkpoint operation depends on the length of execution that the checkpoint operation must support. For short execution lengths, pure-hardware checkpoint operations are feasible and are already used for conditional branch speculation, dependence speculation, and fault tolerance. However, if longer execution lengths are to be supported by the checkpoint/recovery operations, a pure-hardware checkpoint operation provides insufficient capacity for buffering architectural state.