1. Field of the Invention
The present invention relates to an error information saving apparatus which saves error information in a computer that performs a pipeline operation.
2. Description of the Related Art
Conventionally, there exist a number of computers that perform a pipeline operation as shown in FIG. 6 in order to improve their operating performance. In this pipeline operation, a first operation instruction (to be referred to as OP-1 hereinafter), for example, is fetched (abbreviated F). During the next machine cycle, this operation instruction OP-1 is decoded (abbreviated D) and the next operation instruction OP-2 is fetched at the same time. During the subsequent machine cycle, the first operation instruction OP-1 is executed (abbreviated E), the operation instruction OP-2 is decoded, and a new operation instruction OP-3 is fetched at the same time. In the subsequent cycle, the result of the executed operation instruction OP-1 is written back (abbreviated W), and a new instruction is fetched. In such an operation system, i.e., a pipeline operation having the operation stages described above, even when the execute stage (E) is not finished in one pipeline cycle, the same arithmetic unit can sequentially fetch and execute subsequent instructions without stopping the operation cycles of the instruction pipeline.
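By way of illustration only, the overlap of the fetch, decode, execute, and write-back stages described above can be sketched as follows. The sketch assumes an ideal pipeline in which every stage takes exactly one machine cycle; the function name `pipeline_schedule` is hypothetical and is not part of any described apparatus.

```python
# Ideal four-stage pipeline: every stage is assumed to take one machine cycle.
STAGES = ["F", "D", "E", "W"]

def pipeline_schedule(instructions):
    """Return {cycle: {instruction: stage}} for an ideal 4-stage pipeline."""
    schedule = {}
    for i, op in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1  # instruction i is fetched in cycle i + 1
            schedule.setdefault(cycle, {})[op] = stage
    return schedule

schedule = pipeline_schedule(["OP-1", "OP-2", "OP-3"])
for cycle in sorted(schedule):
    print(cycle, schedule[cycle])
```

In the third cycle of this sketch, OP-1 is in the execute stage while OP-2 is decoded and OP-3 is fetched, which corresponds to the overlap described with reference to FIG. 6.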
If, for example, an error occurs in a computer that performs this pipeline operation, it is generally difficult or impossible to specify the instruction (statement) which caused the error or the destination register used in the operation as an object to be scanned upon recovery. This is because, in current systems, if an error requiring activation of an "error interrupt routine" occurs during execution of the pipeline operation, execution of this error interrupt routine is started only after all instructions fetched before occurrence of the error have been completely executed, so that no conflict arises in the operation after returning from the interrupt routine. As a result, since the instructions subsequent to the one which caused the error have already been executed when the error interrupt routine is activated, the contents of the various registers and flags that were set immediately after occurrence of the error have been rewritten. Therefore, it is often considered impossible to specify the location of an error on the instruction word level.
Assume, for example, that an error occurs during the execute stage (E) of the operation instruction OP-1 shown in FIG. 6. In this case, the status of that operation is written in a single status flag (f) 3, and the statuses of the subsequent operation instructions OP-2 and OP-3 are then overwritten in the same status flag 3. Since the status held in this status flag 3 at the time the interrupt routine is activated is the one obtained when execution of the operation instruction OP-2 is finished, the status at the time of occurrence of the error cannot be specified. As a result, error recovery is often impossible in an error information acquiring system having the above arrangement.
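By way of illustration only, the loss of the error status in a single status flag can be sketched as follows. The sketch assumes that each completed instruction simply overwrites the one flag and that the interrupt routine reads the flag only after the already fetched instructions have completed; the function name `status_seen_by_interrupt` and the status strings are hypothetical.

```python
def status_seen_by_interrupt(completed):
    """Return the contents of a single status flag after all listed
    instructions have completed, in order, before the interrupt routine runs."""
    status_flag = None
    for op, status in completed:
        status_flag = (op, status)  # each instruction overwrites the one flag
    return status_flag

# OP-1 causes the error; OP-2 was already fetched and completes normally.
completed = [("OP-1", "error"), ("OP-2", "ok")]
seen = status_seen_by_interrupt(completed)
print(seen)  # ('OP-2', 'ok')
```

In this sketch, the interrupt routine sees only the status of the last instruction to complete (OP-2 here), so the status of OP-1 at the time of the error can no longer be recovered from the flag.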
Especially in a parallel processor system in which a plurality of different arithmetic units can be simultaneously used in the execute stage (E), there is a high possibility that the above-mentioned problem occurs. In addition, it is much more difficult to correctly recognize an order of a number of operations parallel-processed after occurrence of an error and to perform correct recovery processing.
In the case of a "floating-point operation instruction" that requires a plurality of cycles to execute, as shown in FIG. 7, recovery processing is more complicated. That is, even when a status must be referred to, the status of a floating-point addition instruction "A" is set in the status flag in the fifth pipeline cycle, and therefore the status of a floating-point multiplication instruction "M" cannot be set. As a result, when the status flag is referred to by an instruction "G" for referring to the status of the floating-point multiplication instruction "M", the status of the multiplication instruction "M" to be referred to is not set during a period S. As a consequence, since the pipeline processing is stopped, the time interval indicated by the hatched portions in FIG. 7 is wasted, and this makes it difficult to prevent a decrease in pipeline processing efficiency.
In a conventional computer that performs the pipeline operation as described above, the possibility of successful error recovery is very low, and this results in a tendency to sacrifice the reliability of the computer to some extent in order to improve its performance. On the other hand, in a computer having high reliability as its characteristic feature, if, for example, the execute stage of an instruction pipeline exceeds one machine cycle, control is performed such that the instruction pipeline is stalled in correspondence with the excess so that only one instruction is present in the execute stage at any time, thereby improving the reliability of the computer. As a result, there is a tendency to abandon the further performance improvement obtainable using the operation pipeline in order to reliably specify the location of an error and enable recovery processing.
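By way of illustration only, the trade-off described above can be sketched by counting machine cycles under two simplified policies. The sketch assumes one cycle each for fetch, decode, and write-back, an execute latency given per instruction, and, in the non-stalled case, that an arithmetic unit is always available; the function name `total_cycles` and the latencies used are hypothetical.

```python
def total_cycles(latencies, stall):
    """Return the cycle in which the last write-back occurs.

    latencies: execute-stage latency (in cycles) of each instruction.
    stall:     if True, only one instruction may occupy the execute stage.
    """
    e_busy_until = 0  # last cycle in which the execute stage is occupied
    last_write_back = 0
    for i, latency in enumerate(latencies):
        e_start = i + 3  # fetched in cycle i + 1, decoded in cycle i + 2
        if stall:
            e_start = max(e_start, e_busy_until + 1)  # wait for the stage to empty
        e_end = e_start + latency - 1
        e_busy_until = e_end
        last_write_back = max(last_write_back, e_end + 1)
    return last_write_back

latencies = [1, 3, 1]  # the second instruction needs three execute cycles
stalled = total_cycles(latencies, stall=True)
overlapped = total_cycles(latencies, stall=False)
print(stalled, overlapped)  # 8 7
```

Under these assumptions the stalled pipeline needs more cycles, but its instructions always complete in program order. In the overlapped case the third instruction completes before the second, which is the situation in which the instruction that caused an error becomes hard to identify.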
As described above, in conventional computers that perform the pipeline operation, it is very difficult to specify the location of an error on the instruction word level. In addition, when a plurality of errors occur in succession as shown in FIG. 8, i.e., if one error occurs and another occurs before all instructions already fetched have been completely executed and the error interrupt routine is activated, it is very difficult to specify the number or order of these errors. Furthermore, even if a computer can achieve high performance by the pipeline operation, the possibility of correctly recovering from errors is very low. Therefore, it is difficult to obtain both high performance and high reliability at the same time.