One type of control system configuration is a distributed system. In a distributed system, multiple nodes (such as computers and controllers) cooperate by exchanging messages via a communication network. In such a system, even when part of the system suffers a fault, the remaining normal part can execute, on behalf of the faulty part, the tasks for which the faulty part has been responsible. Thus, a highly fault-tolerant system can be achieved.
A problem with distributed systems is fault detection. Because the nodes operate independently of each other, situations can arise in which the nodes do not all share the same information on fault occurrences (such as their locations, types, and timings). Such a difference in information between nodes can cause the nodes to operate in contradiction to each other and can lead to a system failure. Therefore, in distributed control systems, it is important that all nodes share the same information on fault occurrences.
One method by which nodes can share the same information on fault occurrences is mutual node fault monitoring. In mutual node fault monitoring methods, all the nodes independently monitor one another and detect node faults, and the fault detection results obtained at the respective nodes are exchanged among the nodes via a network. Each node then examines the collected fault detection results and makes a final fault determination. One method for making such a final fault determination is, for example, to determine that a fault has occurred when the number of nodes that detect the fault exceeds a predetermined threshold (such as a majority of the nodes).
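The threshold-based final determination described above can be illustrated by the following minimal sketch. It is not taken from the cited document; the function name `final_fault_determination`, the layout of `reports`, and the choice to ignore a node's report about itself are assumptions made only for illustration.

```python
from typing import Dict


def final_fault_determination(
    reports: Dict[str, Dict[str, bool]], threshold: int
) -> Dict[str, bool]:
    """Return, for each node, whether it is finally determined to be faulty.

    reports[observer][target] is True when `observer` detected a fault in
    `target` (hypothetical data layout). A missing entry counts as "no fault
    detected". A node's report about itself is ignored (an assumption).
    """
    nodes = sorted(reports)
    result = {}
    for target in nodes:
        # Count how many other nodes reported a fault in this target.
        votes = sum(
            1
            for observer in nodes
            if observer != target and reports[observer].get(target, False)
        )
        # A fault is determined only when the count exceeds the threshold.
        result[target] = votes > threshold
    return result
```

For example, with four nodes and a threshold of half the node count, a node reported as faulty by the three other nodes is determined to be faulty, whereas a node accused by only one other node is not.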
In general, such mutual fault monitoring and determination methods are subject to the limitation expressed by the following equation.
(Number of processes) ≥ 3 × (number of identifiable faults) + 1  (1)
In other words, identifying even a single fault requires 3 × 1 + 1 = 4 processes, so at least four nodes need to participate in a mutual fault monitoring system if any fault identification is to be made. However, some small systems or subsystems may be configured with only three nodes. A technique has been discussed that can perform fault identification even in systems with only three nodes (see Non-patent Document 1).
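The limitation of equation (1) can be checked numerically with the following short sketch. The function names are hypothetical, and each node is assumed to correspond to one process.

```python
def min_nodes_required(num_faults: int) -> int:
    """Smallest number of nodes that satisfies equation (1) for num_faults."""
    if num_faults < 0:
        raise ValueError("num_faults must be non-negative")
    return 3 * num_faults + 1


def max_identifiable_faults(num_nodes: int) -> int:
    """Largest number of faults f such that num_nodes >= 3 * f + 1."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return (num_nodes - 1) // 3
```

Under equation (1), `max_identifiable_faults(3)` evaluates to 0 and `max_identifiable_faults(4)` evaluates to 1, which matches the statement that a three-node system cannot identify any fault by this class of methods.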
Non-patent Document 1: Yoneda, Kajiwara, Tsuchiya, “Dependable System”, published by Kyoritsu-Shuppan (November 2005)