In server processors, where reliability is important, instruction retry processing is performed when an error is detected during instruction processing: the hardware reruns the instruction that was being processed at that time. For example, when errors such as the following occur, executing an instruction retry may allow processing to continue without abnormally terminating the program:
(1) Errors that Occur as the Internal State of the Hardware Temporarily Changes Due to Alpha Rays or the Like
As this error is not caused by a failure of the hardware itself, the possibility that the same error will occur when the instruction is rerun is very low. Therefore, this type of error can almost certainly be recovered from by performing the instruction retry.
(2) Errors that Occur Due to Noise from the Adjacent Wiring Inside the Hardware
When a signal line inside the processor is close to failing due to electromigration or the like, an error may occur in signal lines adjacent to that signal line.
The possibility of recovering from this type of error may be increased by rerunning only a single instruction, as the probability that the adjacent wiring changes state during the rerun is greatly decreased.
FIG. 1 is a diagram illustrating a conventional instruction retry method.
A conventional arithmetic device 100 that has an instruction retry mechanism includes an instruction execution circuit 101, an execution state control circuit 102, and a retry control circuit 103, as illustrated in FIG. 1.
The instruction execution circuit 101 fetches an arbitrary instruction from a storage device, and decodes the fetched instruction. Then, the instruction execution circuit 101 performs an arithmetic operation on the basis of the decoded instruction. Moreover, when executing the instruction, the instruction execution circuit 101 sequentially issues orders for updating a programmable resource, and checks for errors in the instruction execution. Furthermore, when the instruction execution or the resource update is completed, the instruction execution circuit 101 notifies the retry control circuit 103 of that completion.
The execution state control circuit 102 orders the instruction execution circuit 101 to cancel the instruction execution. The retry control circuit 103 determines the timing at which it is possible for the instruction execution circuit 101 to perform an instruction retry, and controls the ON/OFF state of a flag (e.g., a register) that indicates the determination to perform the instruction retry. Then, when determining that it is possible to perform a retry, the retry control circuit 103 orders the instruction execution circuit 101 to execute a single instruction.
When the instruction is for updating only one resource, the instruction execution circuit 101 signals the completion of the instruction execution and the completion of the resource update at the same time. When the instruction is for updating the resource in two or more cycles, the instruction execution circuit 101 signals the completion of the instruction execution and the completion of a resource update at different times. In this case, the retry control circuit 103 determines that it is not possible to perform a retry between the completion of a resource update and the completion of the instruction execution.
In the above configuration, retry processing is performed as follows.
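The retry window described above can be illustrated with a minimal sketch. It assumes a cycle-by-cycle model in which each cycle carries a set of completion signals; the signal names and the function `retry_possible_per_cycle` are hypothetical and are not taken from the conventional device itself.

```python
# Hypothetical signal names for the two completion notifications sent by
# the instruction execution circuit 101 to the retry control circuit 103.
RESOURCE_UPDATE_DONE = "resource_update_done"
INSTRUCTION_DONE = "instruction_done"


def retry_possible_per_cycle(cycles):
    """Return, for each cycle, whether a retry would be judged possible.

    `cycles` is a list of sets; each set holds the signals raised in that
    cycle. A retry is treated as impossible from the completion of a
    resource update until the completion of the instruction execution.
    """
    possible = True
    trace = []
    for signals in cycles:
        if RESOURCE_UPDATE_DONE in signals:
            # A resource has been written; a rerun could repeat the update.
            possible = False
        if INSTRUCTION_DONE in signals:
            # The instruction is finished; the next one may be retried.
            possible = True
        trace.append(possible)
    return trace


# Both completions signaled in the same cycle.
single_update = [{RESOURCE_UPDATE_DONE, INSTRUCTION_DONE}]
# Update spread over several cycles: the completions arrive at different times.
multi_cycle_update = [set(), {RESOURCE_UPDATE_DONE}, set(), {INSTRUCTION_DONE}]
```

In this model, an instruction whose two notifications coincide never closes the retry window, while the multi-cycle case leaves a span of cycles in which an error could not be recovered by a retry.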
(1) When detecting an error while executing an instruction, the instruction execution circuit 101 notifies the execution state control circuit 102 and the retry control circuit 103 of the occurrence of the error.
(2) When receiving a notification of the occurrence of the error from the instruction execution circuit 101, the execution state control circuit 102 immediately orders the instruction execution circuit 101 to cancel the instruction execution, in order to prevent resources from being updated using erroneous data.
(3) When receiving a notification of the occurrence of the error from the instruction execution circuit 101, the retry control circuit 103 determines whether it is possible to perform a retry. If it is determined that it is possible to perform a retry, the retry control circuit 103 sets a flag indicating that it is possible to perform an instruction retry, and orders the instruction execution circuit 101 to rerun the instruction.
(4) Meanwhile, when receiving an order to cancel the instruction execution from the execution state control circuit 102, the instruction execution circuit 101 clears all processing in progress in the instruction execution circuit 101. Moreover, when the order to cancel the instruction execution from the execution state control circuit 102 is negated, the instruction execution circuit 101 reruns the instruction in accordance with the order from the retry control circuit 103.
(5) When the rerunning of the instruction is completed, the instruction execution circuit 101 notifies the retry control circuit 103 of the completion of the instruction execution.
(6) When receiving a notification of the completion of the instruction execution from the instruction execution circuit 101, the retry control circuit 103 resets the flag indicating that it is possible to perform the instruction retry, and negates the rerun order to the instruction execution circuit 101.
(7) When the rerun order is negated, the instruction execution circuit 101 completes the retry processing, and resumes normal instruction execution processing.
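Steps (1) to (7) can be traced with a minimal software sketch. This is an illustration under stated assumptions, not a model of the actual circuits: the function `run_with_retry`, the callback `execute`, and the helper `transient_fault` are hypothetical names, the steps that hardware would perform concurrently are logged one after another, and a rerun that fails again is simply treated as unrecoverable because the text above does not describe that case.

```python
def run_with_retry(execute, retry_possible=True):
    """Trace the conventional retry sequence for one instruction.

    `execute(attempt)` stands in for instruction execution circuit 101 and
    returns (result, error_detected). Returns (result, log); the result is
    None when the instruction could not be completed.
    """
    log = []
    result, error = execute(0)
    if not error:
        return result, log  # normal execution, no retry needed

    log.append("(1) 101 reports the error to 102 and 103")
    log.append("(2) 102 orders 101 to cancel the instruction execution")
    log.append("(4) 101 clears all processing in progress")
    if not retry_possible:
        log.append("103 determines that a retry is not possible")
        return None, log

    log.append("(3) 103 sets the retry flag and orders a rerun")
    log.append("(4) 102 negates the cancel order; 101 reruns the instruction")
    result, error = execute(1)
    if error:
        # Assumption: the source does not describe a failed rerun.
        log.append("rerun failed; treated here as unrecoverable")
        return None, log

    log.append("(5) 101 reports completion of the rerun to 103")
    log.append("(6) 103 resets the retry flag and negates the rerun order")
    log.append("(7) 101 resumes normal instruction execution")
    return result, log


def transient_fault(attempt):
    """Hypothetical instruction that fails once (e.g., a transient upset)."""
    return (None, True) if attempt == 0 else (42, False)


result, log = run_with_retry(transient_fault)
```

With `transient_fault`, the first attempt is cancelled and the rerun supplies the result, which corresponds to the transient-error case in which a retry is expected to succeed.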
For processors in which performance is important, processors have been proposed that improve performance by concurrently processing sequences of instructions for two or more threads.
For example, processors have been proposed that use a method called "fine grained vertical multi-threading", which executes a sequence of instructions for a different thread in every cycle, or a method called "simultaneous multi-threading", which executes sequences of instructions for two or more threads at the same time. These methods realize the concurrent processing of sequences of instructions for two or more threads using the instruction execution circuit.
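The difference between the two methods can be sketched as issue schedules. The round-robin thread selection and the `issue_width` parameter below are assumptions made for illustration only; real processors use their own selection policies.

```python
def fine_grained_schedule(num_threads, num_cycles):
    """One thread per cycle, a different thread each cycle (round-robin)."""
    return [[cycle % num_threads] for cycle in range(num_cycles)]


def simultaneous_schedule(num_threads, num_cycles, issue_width):
    """Up to `issue_width` distinct threads issue in the same cycle."""
    slots = min(issue_width, num_threads)
    return [
        [(cycle * slots + slot) % num_threads for slot in range(slots)]
        for cycle in range(num_cycles)
    ]


# Each inner list holds the thread IDs that issue in that cycle.
vertical = fine_grained_schedule(num_threads=2, num_cycles=4)
simultaneous = simultaneous_schedule(num_threads=2, num_cycles=3, issue_width=2)
```

In the first schedule each cycle belongs to exactly one thread, whereas in the second several threads occupy the instruction execution circuit in the same cycle.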
As high performance and high reliability are required in processors for servers, both high performance in processing a sequence of instructions for two or more threads at the same time and high reliability in performing retry processing upon the occurrence of an error are required.
As a method of performing retry processing in a processor that processes a sequence of instructions for two or more threads, the following two methods are possible.
(A) A method in which the processor is provided with only one instruction retry mechanism of the kind used in a conventional processor that processes a sequence of instructions for a single thread, and the mechanism is shared by all the threads.
(B) A method in which an instruction retry mechanism of the kind used in a conventional processor that processes a sequence of instructions for a single thread is provided for each thread of the processor.
In method (A), however, it is not possible to perform an instruction retry if any one of the two or more threads being processed is in a state in which a retry cannot be performed at the time the error is detected. In other words, the greater the number of threads, the greater the possibility that it will be determined that an instruction retry is not possible. Accordingly, the success rate of retries becomes lower than that of a processor for a single thread.
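The effect of the thread count in method (A) can be made concrete with a small calculation. It assumes, purely for illustration, that each thread is independently in a retry-possible state with the same probability `p` at the moment an error is detected; neither the independence nor any particular value of `p` comes from the text above.

```python
def shared_retry_probability(p, num_threads):
    """Probability that a shared retry mechanism can start a retry.

    Under the independence assumption, every one of `num_threads` threads
    must be in a retry-possible state, each with probability `p`.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return p ** num_threads


# Retry-possible probability for 1, 2, 4, and 8 threads at p = 0.9.
by_thread_count = {n: shared_retry_probability(0.9, n) for n in (1, 2, 4, 8)}
```

Under this assumption the probability falls geometrically with the number of threads, which matches the qualitative statement that a shared mechanism retries less often than a single-thread processor.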
In method (B), an instruction retry is performed separately for each thread. In other words, the sequences of instructions for threads in which no error is detected continue to be executed normally while instruction retry processing is being performed for another thread in which an error was detected. Accordingly, in comparison to a processor that processes a single thread, more circuitry is operating while an instruction retry is being processed. Therefore, when an error is caused by noise from other wiring, the success rate of retries becomes lower than that of a processor for a single thread.
In relation to the technique described above, Patent Document 1 discloses an information processing device that achieves a high-quality instruction retry function by being configured to perform an instruction retry repeatedly and thereby verify the result.
In Patent Document 2, an information processing device is disclosed in which a command that accesses operand data two or more times is divided into commands which each access the operand data only one time, and in which when an error has occurred during the execution, only that command is rerun.
Patent Document 1: Japanese Laid-open Patent Publication No. 2006-040174
Patent Document 2: U.S. Pat. No. 5,564,014