The invention relates to new and useful improvements in computer systems. More particularly, the invention relates to a method for synchronization of programs on different computers in a network.
Computer systems are increasingly being networked to one another via communication devices. In some applications, particular attention has to be paid to the synchronization of the data processing systems which are interconnected in a network.
The type of synchronization may be subject to various requirements. On the one hand, the synchronism required of the computers in a network may be predominantly temporal in nature. The computers should operate as nearly in parallel as possible, i.e., the instructions of a running program should, as far as possible, be at the same processing state on every computer at any point in time. Thus, if possible, all the computers should be executing the same processing sequence at all times.
Going beyond pure temporal parallelism, the synchronism of computers in a network may be subject to the additional requirement that the respective processing sequences in the computers should also, if possible, have the same logical content, i.e., the same meaning in data-processing terms. This means that the computers in a network, having approximately concurrent instruction processing based on initial values which are as identical as possible, should also achieve result values which are as identical as possible. In such a case, comparisons of original values as well as of selected current processing values from the individual programs on the computers are advantageous in the context of temporal synchronization.
In practical technical applications, owing to the widely different types of disrupting influences, it is impossible for the instruction processing in different computers that are networked to one another to maintain temporal and/or logical-content synchronism over a lengthy time period. In fact, as a rule, special technical measures have to be carried out cyclically in order to maintain or restore the synchronism of the instruction processing actions in the computers.
This is particularly important if the computers in a network are intended to form a so-called high-availability or fail-safe system. Examples of this are so-called "one of two" or "two of two" systems.
In the case of a "one of two" system, two connected computers with identical programs are intended to process the same original data, such as measurements, subject to the boundary condition of high computer system availability. It is necessary to ensure regularly that the processing states of the two computers do not diverge too severely in the medium term as the result of processing speeds which may differ only slightly from one another. Furthermore, it is necessary to ensure regularly that both computers have matching processing results. Specifically, in a situation in which the two computers are used for controlling a technical process which requires high availability, and one of the two computers fails, this is a precondition for the other computer to be able to continue to control the technical process virtually without any discontinuity. Thus, in the case of such a "one of two" system, the two computers involved must be synchronized in time by means of special synchronization measures. Furthermore, their current processing contents must be regularly checked for equivalence.
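The two regular checks described for a "one of two" system can be illustrated with a minimal Python sketch. It is not taken from the invention itself: the `Snapshot` type, its `step` counter, the `max_step_drift` threshold and the returned status strings are all hypothetical names introduced here for illustration, and the sketch assumes that results are only comparable when both computers have reached the same step.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    step: int    # hypothetical counter of processing steps completed so far
    result: int  # hypothetical current processing value at that step


def check_pair(a: Snapshot, b: Snapshot, max_step_drift: int) -> str:
    """Cyclic check for two computers running identical programs.

    Returns "resynchronize" if the processing states have drifted too far
    apart, "results differ" if both are at the same step but disagree,
    and "ok" otherwise.
    """
    # Temporal check: have the processing states diverged too severely?
    if abs(a.step - b.step) > max_step_drift:
        return "resynchronize"
    # Content check: assumed to be meaningful only at an identical step.
    if a.step == b.step and a.result != b.result:
        return "results differ"
    return "ok"
```

For example, `check_pair(Snapshot(100, 5), Snapshot(110, 5), 3)` reports that time synchronization is required, whereas two snapshots at the same step with equal results are reported as `"ok"`.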
The conditions in a so-called "two of two" computer system are very similar to those for a "one of two" computer system. Two connected computers with identical programs are intended to process the same original data, such as measurements, subject to the boundary condition of high processing reliability. However, the desired result reliability is no longer provided if a sudden, uncorrectable non-equivalence is detected in the processing results. Thus, in the case of safety-relevant processing, the user programs on both computers must be brought to a safe state, for example the stop state. Further processing of the safety-relevant process is no longer possible since it is impossible to decide which program on which of the two computers had the still "correct" processing results at the moment when the non-equivalence suddenly occurred.
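The behavior of a "two of two" system on a mismatch can be sketched as follows. This is an illustrative assumption rather than a prescribed implementation: the `State` enumeration and the function name `two_of_two_check` are hypothetical, and the sketch treats every detected non-equivalence as uncorrectable.

```python
from enum import Enum


class State(Enum):
    RUN = "run"
    STOP = "stop"  # the safe state named in the text


def two_of_two_check(result_a: int, result_b: int) -> State:
    """Compare the current results of two computers.

    With only two results there is no way to tell which one is correct,
    so any mismatch drives both user programs to the safe state.
    """
    return State.RUN if result_a == result_b else State.STOP
```

The sketch makes the limitation visible: the comparison yields only "equal" or "not equal", so no surviving computer can be chosen and processing cannot continue.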
Comparable boundary conditions also exist in a so-called "two of three" system. In this case, the requirements for high system availability and high data processing reliability can be satisfied at the same time. Three connected computers can process the same original data, using identical programs. Once again, it is necessary to ensure at regular time intervals that the processing states of the three computers have not diverged too severely and that each computer is producing the same processing results. If any non-equivalences are detected during a comparison of current processing results, a computer whose current processing results suddenly and permanently deviate from the matching processing results of the two other computers may be regarded as defective and must be excluded from the network. This process is also called a majority decision. The system thus maintains both processing reliability and availability if one of the three computers fails. Following such a situation, the original "two of three" system then, in practice, reverts to a "two of two" system or a "one of two" system, depending on whether the two remaining computers in the system are intended to continue processing the data for the respective programs with high result reliability or with high availability.
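A majority decision among three computers can be sketched in Python as follows. The function name `majority_decision` and its return convention are assumptions made for illustration. The sketch evaluates a single comparison only and therefore omits the requirement that a deviation be permanent before a computer is excluded.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def majority_decision(
    results: Sequence[Hashable],
) -> Tuple[Optional[Hashable], List[int]]:
    """Vote over the current results of exactly three computers.

    Returns (agreed_value, deviating_indices). If no two results match,
    agreed_value is None and all three indices are reported.
    """
    if len(results) != 3:
        raise ValueError("a two-of-three vote needs exactly three results")
    # Find the most frequent result and how often it occurs.
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        return None, [0, 1, 2]  # no majority: nothing can be trusted
    # Computers whose result differs from the majority are candidates
    # for exclusion from the network.
    deviating = [i for i, r in enumerate(results) if r != value]
    return value, deviating
```

For instance, `majority_decision([42, 42, 41])` returns `(42, [2])`: the third computer deviates from the two matching results. After its exclusion, the remaining pair would be operated as a "two of two" or "one of two" system.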
In the case of computer systems of the above type, regularly repeating measures must generally be used to ensure and/or reproduce the temporal parallelism of the internal processing sequence of the instructions. The measures which are required for such synchronization are intended to have as little adverse effect as possible on the normal operation of each computer involved and to run as quickly as possible.