1. Field of the Invention
The present invention relates to fault tolerant computers, and in particular to those comprising plural (two or more) operation controllers.
This application is based on Patent Application No. Hei 9-306074 filed in Japan, the contents of which are incorporated herein by reference.
2. Description of the Related Art
A conventional computer system, as shown in FIG. 4, is known which comprises plural operation controllers and in which operations can be restarted or continued even if one of the operation controllers is damaged. Such a system is called a "fault tolerant computer system" using the multiprocessor method. When one of the operation controllers constituting the computer system is damaged, the outputs from all operation controllers are compared, and the damaged controller is detected by a majority decision system or the like. The output of the detected damaged controller is then masked, or the damaged controller is separated from the system.
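The majority decision described above may be illustrated by the following minimal sketch. It is given for explanation only and is not the circuit of FIG. 4; the function name `majority_vote`, the controller identifiers, and the representation of outputs as plain values are assumptions introduced here.

```python
from collections import Counter


def majority_vote(outputs):
    """Compare the outputs of all operation controllers.

    `outputs` maps a (hypothetical) controller identifier to its output value.
    Returns (voted_value, faulty_ids). When no strict majority exists,
    returns (None, []) because no controller can be singled out.
    """
    if not outputs:
        return None, []
    counts = Counter(outputs.values())
    value, votes = counts.most_common(1)[0]
    # A strict majority is required; a tie cannot identify the damaged unit.
    if votes * 2 <= len(outputs):
        return None, []
    # Controllers that disagree with the majority are treated as damaged;
    # their outputs are masked by returning only the voted value.
    faulty = sorted(cid for cid, out in outputs.items() if out != value)
    return value, faulty
```

For example, `majority_vote({"A": 10, "B": 10, "C": 7})` yields the voted value 10 and reports controller "C" as the one to be masked or separated from the system.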
Japanese Patent Application, First Publication, No. Hei 1-288928 discloses an example of such a computer system, in which outputs from plural subsystems are collected at a single judgment circuit, where they are compared with one another and, together with diagnostic information, used to determine and output a correct output.
On the other hand, Japanese Patent Application, First Publication, No. Hei 6-149605 discloses a judging method which essentially uses distributed processing without using a single judgment circuit. The system according to this method does not use a centralized judgment circuit such as that used in the above system of No. Hei 1-288928, and thus is known as a fault tolerant computer system that tolerates even a fault in the judgment circuit itself.
The above-described conventional fault tolerant computers have the following problems.
The first problem is that, in conventional techniques, each operation controller constituting the parallel processing system must have equal operation control functions and capabilities, which increases the size, power consumption, and weight of the system.
This problem of the fault tolerant computer using a parallel structure arises because the outputs of plural operation controllers are compared, an operation controller having a transient or permanent fault is identified, and the data regarded as the most accurate is output to the outside of the operation controllers. To realize such an operation, plural operation controllers that perform similar operational and control processes, that is, substantially equal operation controllers, are necessary.
The second problem is that, in conventional techniques, a system having at least a triplet structure is necessary for realizing real-time identification of an operation controller having a transient or permanent fault. This also increases the size, power consumption, and weight of the system.
This is because, in a structure including plural operation controllers, at least a triplet structure is necessary for identifying which operation controller is damaged when one of them is damaged. In contrast, with a doublet structure, a disagreement between the two outputs can be observed, but real-time identification of the damaged operation controller is impossible.
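The difference between the doublet and triplet structures may be illustrated by the following minimal sketch, which assumes that identification is performed by a strict majority of the compared outputs. The function name `identify_faulty` and the controller identifiers are hypothetical and introduced here for explanation only.

```python
def identify_faulty(outputs):
    """Try to identify damaged operation controllers by comparing outputs.

    `outputs` maps a (hypothetical) controller identifier to its output value.
    Returns a sorted list of identifiers that disagree with a strict majority,
    or None when no strict majority exists and identification is impossible.
    """
    values = list(outputs.values())
    for candidate in set(values):
        # A value held by more than half of the controllers is trusted.
        if values.count(candidate) * 2 > len(values):
            return sorted(cid for cid, v in outputs.items() if v != candidate)
    return None


# Doublet structure: the mismatch is visible, but neither side can be blamed.
doublet = identify_faulty({"A": 1, "B": 2})

# Triplet structure: the two agreeing controllers outvote the damaged one.
triplet = identify_faulty({"A": 1, "B": 2, "C": 1})
```

Under these assumptions, `doublet` is `None`, whereas `triplet` is `["B"]`, which is why conventional techniques require at least three operation controllers for real-time identification.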
The third problem is that, in conventional techniques, it is impossible to dynamically switch between (i) an arrangement in which plural operation controllers are operated simultaneously and (ii) a stand-by redundant arrangement in which only one operation controller is operated while the other operation controllers are not operated.
The reason is that a judgment section or examination and diagnosis section, which identifies a faulty operation controller and separates it from the system, does not operate normally unless it always receives plural inputs.
The fourth problem is that, in conventional techniques, it is also impossible to dynamically switch between (i) an arrangement in which plural operation controllers are simultaneously operable and perform the same operational control so as to realize a multiplexed system, and (ii) an arrangement for distributed processing in which some operation controllers perform different control operations so as to distribute functions, thereby improving the operational control capability of the system and preventing damage at a single point from destroying all functions.
The reason is again that a judgment section or examination and diagnosis section, which identifies a faulty operation controller and separates it from the system, does not operate normally unless it always receives plural inputs.