As personal computers and workstations become more and more powerful, makers of mainframe computers have undertaken to provide features which cannot readily be matched by these smaller machines in order to stay viable in the marketplace. One such feature may be broadly referred to as fault tolerance, which means the ability to withstand and promptly recover from hardware faults and other faults without the loss of crucial information. The central processing units of mainframe computers typically have error and fault detection circuitry, and sometimes error recovery circuitry, built in at numerous information transfer points in the logic to detect and characterize any fault which might occur.
The CPU(s) of a given mainframe computer comprise many registers logically interconnected to achieve the ability to execute the repertoire of instructions characteristic of the computer. In this environment, the achievement of genuinely fault tolerant operation, in which recovery from a detected fault can be instituted at a point in a program immediately preceding the faulting instruction/operation, requires that one or more recent copies of all the software visible registers be maintained and constantly updated. This procedure is typically carried out by reiteratively sending copies of the registers (safestore information) to a special, dedicated memory or memory section. Sometimes, two safestore memories are provided to alternately receive and temporarily store two recent copies of the software visible registers, one of which is always more recent than the other. When a fault occurs and analysis (performed, for example, by a service processor) determines that recovery is possible, the safestore information is used to reestablish the software visible registers in the CPU with the contents they held shortly before the fault occurred, so that restart can be attempted from the corresponding place in program execution.
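The alternating use of two safestore memories can be illustrated with a minimal software sketch. This is only a model of the idea under stated assumptions, not the hardware arrangement itself: the class name `Safestore`, its methods, and the register names `pc` and `acc` are hypothetical and do not come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Safestore:
    """Hypothetical model: two buffers that alternately receive register copies."""
    buffers: list = field(default_factory=lambda: [None, None])
    latest: int = -1  # index of the most recently written buffer; -1 means none yet

    def save(self, registers: dict) -> None:
        # Always overwrite the buffer that does NOT hold the newest copy,
        # so one older copy survives while the new one is being written.
        target = (self.latest + 1) % 2
        self.buffers[target] = dict(registers)
        self.latest = target

    def restore(self) -> dict:
        # Return the most recent saved copy of the software visible registers.
        if self.latest < 0:
            raise RuntimeError("no safestore copy available")
        return dict(self.buffers[self.latest])


# Assumed register names, for illustration only.
regs = {"pc": 100, "acc": 0}
store = Safestore()

store.save(regs)                    # copy taken before instruction at pc=100
regs["pc"], regs["acc"] = 101, 7
store.save(regs)                    # copy taken before instruction at pc=101

regs["pc"], regs["acc"] = 102, -1   # a fault leaves the live registers suspect
regs = store.restore()              # reestablish the registers from safestore
```

After the restore, `regs` holds the state saved before the faulting instruction, while the other buffer still holds the older copy.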
Those skilled in the art are aware of certain drawbacks to the usual provision of safestore capability, which drawbacks directly and adversely affect CPU performance. Thus, as higher levels of CPU performance are sought, the performance penalty resulting from the incorporation of safestore techniques to enhance fault tolerance must be more closely considered. One source of performance penalty experienced with the use of safestore techniques is that certain instructions which execute iteratively may be well along in the execution of a series of operations, and may have already obtained meaningful results, when a fault is experienced. In particular, a fault often experienced in the execution of these iterative execution instructions is a simple page fault or cache miss on a required information item. In the past, a page fault during such execution resulted in the usual steps to acquire the missing page from main memory or wherever a valid copy is available, place it in the local cache memory and eventually restart the iterative execution instruction from the beginning, thus losing the valid intermediate results already obtained.
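The cost of restarting from the beginning can be illustrated with a small simulation. This is a hedged sketch only: the function `run_iterative`, the choice of a summing operation, the page size of four items, and the step counts are all assumptions made for illustration and are not taken from any particular machine.

```python
def run_iterative(data, resident, page_size=4):
    """Simulate a hypothetical iterative instruction that sums `data`.

    `resident` is the set of page numbers currently in the local cache.
    On a miss, the page is fetched and the instruction restarts from the
    beginning. Returns (result, element_steps_executed).
    """
    steps = 0
    while True:
        total = 0          # restart discards all intermediate results
        faulted = False
        for i, value in enumerate(data):
            page = i // page_size
            if page not in resident:
                resident.add(page)   # acquire the missing page
                faulted = True
                break                # abandon this attempt
            total += value
            steps += 1
        if not faulted:
            return total, steps


data = list(range(1, 13))            # 12 items spread over 3 pages
result, steps = run_iterative(data, resident={0})
```

In this model the instruction faults twice, so 24 element steps are executed to produce a result that needs only 12; the extra 12 steps are intermediate work that was valid but thrown away at each restart.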
The performance penalty can be in the hundreds of clock cycles each time an iterative execution instruction encounters a simple page fault. While this is not a design error, the resultant performance penalty is an obstacle to attaining the desired CPU speed level necessary to maintain competitiveness in the market. The subject invention is directed to the alleviation of this limitation.