The present invention relates to an information processing system and, more particularly, to an information processing system with an operation verification function and a fault detecting method with an address degeneration function.
Remarkable improvements in the performance and reliability of recent information processing systems allow a single information processing system to be applied for various purposes. However reliable an information processing system may be, it is impossible to completely eliminate faults which may occur in the system. Accordingly, as applications of information processing systems become more sophisticated and the influence of such systems becomes more widespread, it is necessary to develop improved systems for coping with the rare faults of currently operating information processing systems. For this purpose, means must first be provided to detect such faults.
Many technologies to detect faults have been developed. Some examples of the technologies will be enumerated below.
(1) With duplex devices or circuits, the results of processing are compared therebetween for each step of the processing to detect errors.
(2) The hardware may be designed to represent information in the system by redundant codes. The checking utilizing the redundancy is always carried out through hardware technology.
(3) Fault detection may be performed by using a test program to check the operation in the information processing system.
In method (1), redundant hardware is provided for each apparatus and each circuit block of the system. Each redundant circuit concurrently performs the same processing. The results of each step of the processing are compared, and when the results coincide with each other, it is determined that the apparatus or the circuit is operating correctly.
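The step-by-step comparison of method (1) can be illustrated in software. The following is only a minimal sketch, assuming two callables stand in for the duplexed circuits; the names `run_duplex`, `healthy_adder`, and `faulty_adder`, and the stuck-at fault itself, are hypothetical and are not taken from any particular system.

```python
from typing import Callable, Iterable, List


class DuplexMismatch(Exception):
    """Raised when the two redundant units disagree on a processing step."""


def run_duplex(unit_a: Callable[[int], int],
               unit_b: Callable[[int], int],
               inputs: Iterable[int]) -> List[int]:
    """Feed each input to both units and compare the results step by step."""
    results = []
    for step, value in enumerate(inputs):
        out_a = unit_a(value)
        out_b = unit_b(value)
        if out_a != out_b:
            # The comparator cannot tell which unit failed, only that one did.
            raise DuplexMismatch(f"step {step}: {out_a!r} != {out_b!r}")
        results.append(out_a)
    return results


def healthy_adder(x: int) -> int:
    return x + 1


def faulty_adder(x: int) -> int:
    # Hypothetical stuck-at fault: bit 2 of the result is forced to 0.
    return (x + 1) & ~0b100
```

With these assumed units, the fault stays hidden until an input exercises the stuck bit, at which point the comparison reports a mismatch on that very step. The sketch also shows the cost side: every step is computed twice and then compared.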
Method (2) includes such techniques as parity checking, residue checking, one-out-of-N checking of control signals, and the like. In the case of a data transfer path or a memory, parity checking is usable at a relatively low cost. However, redundant coding is costly when used for logical arithmetic circuits, control circuits, and the like.
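Two of the redundant-code checks named for method (2) can be sketched in software as follows. This is an illustration only: the checks described above are performed in hardware, and the function names, the even-parity convention, the modulus of 3, and the injectable `adder` parameter are all assumptions made for the example.

```python
from typing import Callable, Tuple


def parity_bit(word: int) -> int:
    """Even-parity bit: 1 when the word holds an odd number of 1 bits."""
    return bin(word).count("1") % 2


def store_with_parity(word: int) -> Tuple[int, int]:
    """Attach the redundant parity bit when the word is written."""
    return word, parity_bit(word)


def parity_ok(word: int, stored_parity: int) -> bool:
    """Recompute parity on read-out; any single-bit flip makes this False."""
    return parity_bit(word) == stored_parity


def residue_checked_add(a: int, b: int,
                        adder: Callable[[int, int], int] = lambda x, y: x + y,
                        modulus: int = 3) -> Tuple[int, bool]:
    """Run the main adder, then check it against a small residue adder."""
    total = adder(a, b)
    # The residue of a sum must equal the sum of the residues (mod modulus).
    expected_residue = (a % modulus + b % modulus) % modulus
    return total, total % modulus == expected_residue
```

Parity needs only one extra bit per word, which is consistent with its low cost on memories and transfer paths. The residue check needs a second, smaller arithmetic unit alongside the main one, which hints at why redundant coding becomes costly for arithmetic and control circuits.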
Method (3) may be further varied depending on (1) the checking initiation method used by the test program, (2) the operation level of the test program, and (3) the device used for storing the test program (at ordinary times and at checking time). Typical variations of method (3) which may be used in an information processing system of the microprogram type are as follows:
(1) The checking initiation method:
A. A method using judgment and instructions by a human.
B. A method in which checking is automatically initiated at substantially constant intervals by using system software or a hardware timer.
(2) The operation level of the test program:
C. Software program level (instruction level).
D. Microprogram level (microinstruction level).
(3) The storing device used for the test program:
E. Storing medium requiring manual operation for checking initiation, such as a magnetic tape.
F. Use of a part of a magnetic disc as a part of software.
G. Use of a part of the memory area of the main memory.
H. Use of a part of the memory area of a control memory.
When the device for storing the test program at ordinary time is different from that at checking time, the method is further varied by means for moving the test program for executing the checking functions.
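As a rough illustration of a test program under method (3) with automatic initiation (variation B), the sketch below runs a known-answer self-test before every few jobs. Several things here are assumptions for the example: the interval is counted in jobs rather than measured by a hardware timer, the test operates at the software level, and the names `KNOWN_ANSWERS`, `self_test`, and `process_with_periodic_check` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical known-answer table: operand pairs and the expected sums.
KNOWN_ANSWERS = [((2, 3), 5), ((0xFF, 1), 0x100), ((0, 0), 0)]


def self_test(add: Callable[[int, int], int]) -> bool:
    """Test program: exercise the operation and compare with known answers."""
    return all(add(a, b) == expected for (a, b), expected in KNOWN_ANSWERS)


def process_with_periodic_check(jobs: Sequence[Tuple[int, int]],
                                add: Callable[[int, int], int],
                                interval: int = 2) -> List[int]:
    """Run the jobs, initiating the test program once every `interval` jobs."""
    results = []
    for index, (a, b) in enumerate(jobs):
        if index % interval == 0 and not self_test(add):
            raise RuntimeError(f"self-test failed before job {index}")
        results.append(add(a, b))
    return results
```

In this arrangement a fault arising between two checks goes unnoticed until the next check, so the chosen interval bounds the time from fault occurrence to detection, and the time spent in `self_test` is processor time taken from normal work.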
In addition to the above mentioned fault detecting methods (1), (2) and (3), additional methods can also be listed. Such techniques include a method for processing and recording the information by redundancy provided in the software, or a method in which information processing is executed two times by different processing procedures and the results compared by software to check for fault occurrence.
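The second software technique, executing the same processing twice by different procedures and comparing the results, might look like the following sketch. The choice of summation as the task and the names `sum_forward`, `sum_pairwise`, and `checked_sum` are assumptions made purely for illustration.

```python
from typing import Callable, Sequence


def sum_forward(values: Sequence[int]) -> int:
    """First procedure: accumulate from left to right."""
    total = 0
    for v in values:
        total += v
    return total


def sum_pairwise(values: Sequence[int]) -> int:
    """Second procedure: split the data in halves and add the partial sums."""
    if not values:
        return 0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


def checked_sum(values: Sequence[int],
                first: Callable[[Sequence[int]], int] = sum_forward,
                second: Callable[[Sequence[int]], int] = sum_pairwise) -> int:
    """Compute the result by both procedures and compare them in software."""
    a = first(values)
    b = second(values)
    if a != b:
        raise RuntimeError(f"results differ: {a} != {b}")
    return a
```

Because the two procedures exercise the machine in different orders, a fault is less likely to corrupt both results identically than if the same procedure were simply repeated. The price is roughly double the processor time, with no additional hardware.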
As just mentioned, various fault checking methods coexist. The reason is that each method has its own advantages and disadvantages, and each information processing system has different characteristics and thus requires a fault checking method suited to those characteristics. Some typical reasons causing processing systems to require different fault detection methods are as follows:
(1) The time lapse from fault occurrence to fault detection.
(2) The cost of additional hardware required to detect faults.
(3) The system resources consumed by the fault detection function (memory capacity and processor time used for fault detection).
(4) The degree of fault detection accuracy required (fault detection rate).
Let us further consider the reasons (1)-(4) just mentioned in terms of fault detection method (1). In the case of this method, the time lapse until fault detection is the shortest, i.e. substantially zero. The hardware cost is the highest. That is, this method uses a duplex system, so that at least double the hardware is necessary, and in addition a comparing circuit is required. Processing ability is not materially reduced, although operation is slightly slower than for a comparable non-redundant system, because the two systems must run in parallel and time must be allowed for the comparing operation. Fault detection is substantially perfect in this system.
From the above evaluation of method (1), it will be seen that this fault detection method is suitable for use in systems where the erroneous information resulting from a fault can cause severe harm or damage, so that fault occurrence is never permissible, even if its possibility is very remote.
In an information processing system used for batch processing, when a fault is detected within 10 seconds from its occurrence, sufficient time exists in which to catch the erroneous information before it goes out of the computer room. Therefore, in the case of this system, a fault detection time of up to 10 seconds is permissible. Consequently, the above fault detecting methods (2) and (3) are preferable for this type of information processing system. From the viewpoint of hardware cost, these methods are superior to method (1).
As a matter of course, the higher the fault detection rate, the better the system. It is very rare, however, that use of only one method attains 100% fault detection. Even where this is possible, the cost needed to realize it is very high. A conventional way of achieving fault detection is generally a combination of two or more of the above-mentioned methods of fault detection, since the possibility of fault occurrence is extremely low. In this case, each fault detection method is evaluated in terms of the trade-off between the fault detection rate and the cost for fault detection. The overall fault detection rate and the overall fault detection cost of plural fault detection methods are evaluated through the trade-off among the former two items, the possibility of fault occurrence, and the adverse effects resulting from erroneous information which may be produced by a fault.
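One way to make this trade-off concrete is sketched below. The model is an assumption introduced only for illustration and is not stated in the text: each method is taken to miss faults independently of the others, so the overall detection rate is one minus the product of the individual miss rates, and the overall cost is the plain sum of the individual costs. The names `Method`, `combined_detection_rate`, `combined_cost`, and `expected_harm`, and any numbers fed to them, are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass(frozen=True)
class Method:
    name: str
    detection_rate: float  # fraction of faults detected, between 0.0 and 1.0
    cost: float            # cost of the method in arbitrary units


def combined_detection_rate(methods: Sequence[Method]) -> float:
    """Overall rate, ASSUMING the methods miss faults independently."""
    return 1.0 - prod(1.0 - m.detection_rate for m in methods)


def combined_cost(methods: Sequence[Method]) -> float:
    """Overall cost, ASSUMING costs simply add up."""
    return sum(m.cost for m in methods)


def expected_harm(fault_probability: float,
                  methods: Sequence[Method],
                  harm_per_undetected_fault: float) -> float:
    """Expected harm from faults that slip past every method."""
    miss_rate = 1.0 - combined_detection_rate(methods)
    return fault_probability * miss_rate * harm_per_undetected_fault
```

Under these assumptions, two methods with hypothetical detection rates of 0.9 and 0.8 would together miss only 0.1 × 0.2 of all faults. A designer could then weigh `combined_cost` against `expected_harm` for each candidate combination; in practice the methods may overlap in what they catch, so the independence assumption tends to be optimistic.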
Each of the various fault detecting techniques thus far developed has features of its own with respect to the above four reasons (1) to (4), and these techniques are the ones predominantly used at present.