1. Field of the Invention
This invention relates to a multiprocessing method to be used for a multiprocessing system (fault tolerant computer system) including a plurality of element processor nodes.
2. Description of the Prior Art
In a multiprocessing system (redundant system) that achieves fault tolerance by multiplexing element processor nodes (also called processor nodes, processor elements, CPU nodes or simply nodes), the treatment of damaged element processor nodes differs depending on the application being executed when some of the element processor nodes break down and go offline.
Taking space vehicles as an example application, in a computer mounted on the navigation and guidance control system of a rocket, when one of the element processor nodes breaks down, the computer cuts off the damaged element processor node instead of recovering it, so as to keep the other, normal element processor nodes online, since the control cycle is very short (on the order of several tens of milliseconds) and the operating time is also short (approximately 10 minutes). In other words, even though the number of element processor nodes decreases by one, continuation of online processing is given priority over recovery of the damaged element processor node because the control cycle and the operating time are short.
On the other hand, in a computer mounted on the attitude control system of an artificial satellite, it is preferable to avoid a decrease in the number of element processor nodes as far as possible by recovering the damaged element processor node, since the allowable time for which control may be stopped by a fault is relatively long (on the order of 1 to 3 seconds) and the operating time extends from several months to several years. In other words, when one of the element processor nodes breaks down and goes offline, the computer starts a standby multiplexing system, causes the normal element processor nodes to copy memory content for the damaged element processor node, or executes a rollback process that goes back to a point at which all the element processor nodes were operating normally and restarts processing from there. During this time, online processing must be stopped because the normal element processor nodes are engaged in the repair and recovery of the damaged element processor node.
Moreover, space planes such as the space shuttle have the characteristics of both the rocket and the satellite. In the space shuttle, for example, a fivefold multiprocessing system is used. In the orbital phase, the damaged element processor node is cut off and processing is switched over to the standby multiplexing system, while in a critical phase such as launch or landing, recovery of the damaged element processor node is not performed, since a short control cycle is required in the same manner as in the aforementioned computer mounted on the rocket.
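The three conventional treatments described above can be summarized as a lookup from application and mission phase to the action taken when a node fails. The following is only an illustrative sketch in Python: the names `Action`, `POLICY` and `handle_fault`, and the phase labels, are hypothetical and do not appear in any of the systems discussed; the table simply restates the behavior described in the preceding paragraphs.

```python
from enum import Enum


class Action(Enum):
    CUT_OFF = "cut off the damaged node and continue online processing"
    RECOVER = "stop online processing and recover the damaged node"
    SWITCH_TO_STANDBY = "cut off the damaged node and switch to the standby system"


# Hypothetical policy table restating the prior-art behavior described above.
POLICY = {
    ("rocket", "flight"): Action.CUT_OFF,
    ("satellite", "orbit"): Action.RECOVER,
    ("space_plane", "launch"): Action.CUT_OFF,
    ("space_plane", "landing"): Action.CUT_OFF,
    ("space_plane", "orbit"): Action.SWITCH_TO_STANDBY,
}


def handle_fault(online_nodes, damaged, application, phase):
    """Return (action, nodes online afterwards, whether online processing stops).

    Simplifying assumption of this sketch: a recovery always succeeds, so the
    damaged node is back online once recovery completes.
    """
    action = POLICY[(application, phase)]
    if action is Action.RECOVER:
        # Normal nodes take part in the recovery, so online processing stops.
        return action, set(online_nodes), True
    # The damaged node is dropped and the remaining nodes stay online.
    return action, set(online_nodes) - {damaged}, False
```

For example, `handle_fault({0, 1, 2}, 1, "rocket", "flight")` drops node 1 and keeps nodes 0 and 2 online, whereas the same fault with `"satellite", "orbit"` keeps all three nodes but reports that online processing is stopped during recovery. The fixed table makes the limitation of this approach visible: the treatment is tied to the application rather than chosen by the system at run time.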
Therefore, dedicated multiprocessing systems have conventionally been researched, developed and put into practical use for each of the respective applications, namely the rocket, the artificial satellite and the space plane. A conventional multiprocessing system is described in “Fault-tolerant Multi-processor Operating System for Engineering Test Satellite-VI Attitude Control Electronics (Shunsuke Tanaka, et al.)” SANE89-40.
However, in the conventional multiprocessing systems described above, it is impossible to increase or decrease the number of element processor nodes during online processing, and there is a problem in that it is difficult to lower the cost because nodes specially designed for the respective applications are used. Furthermore, although the damaged element processor node has conventionally been cut off in cases where the control cycle and the operating time are short, even in such cases it is clearly more desirable for the multiprocessing system to be able to recover the damaged element processor node.