More particularly, the invention applies to synchronizing replicated tasks residing in computers monitoring automatic processes (level 1 of the CIM model) in an industrial monitoring/control facility.
Providing active redundancy over the inputs of replicated tasks (i.e. tasks duplicated in different processors) is a known technique for implementing fault-tolerance procedures. The replicated tasks take the same data at their inputs and run the same program, so it is possible to switch over the outputs of the replicated tasks in the event that one of them fails. Consideration is given below only to replicated tasks that respond to input data only, such data generally being supplied by transmitter tasks that reside in other processors. To provide active redundancy, the replicated tasks must have the same behavior. This is obtained by ensuring that the replicated tasks take their input data in the same chronological order, i.e. that they are synchronized.
Document "ACM Computing Surveys, Vol. 22, No. 4, December 1990: Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial" by F. Schneider discloses a system for synchronizing replicated tasks in which each message carries a time stamp constituted by the send time of the message. This send time is conventionally given by a clock of the processor housing the task that transmits the message. The ticks of the clocks of the processors housing the message-transmitting tasks are short enough to ensure that two messages output by the same processor cannot have the same time stamp. The processors in which the replicated tasks reside also have local clocks, and the clocks of all the processors are optionally resynchronized, as is well known, so as to maintain an identical time reference from one processor to another. Document "Fault-Tolerant Clock Synchronization in Distributed Systems", IEEE Computer, October 1990, describes procedures for synchronizing processor clocks.
The replicated task synchronization system disclosed in Document "ACM Computing Surveys . . . " operates as follows.
A single time constant is pre-determined on the basis of data transfer times measured between each transmitter task and each replicated task. The time constant is equal to the maximum measured transfer time.
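By way of illustration only, the determination of the time constant can be sketched as follows. The task names and the transfer times in milliseconds are hypothetical values chosen for the example; they are not taken from the cited document.

```python
# Hypothetical measured transfer times (in ms), keyed by
# (transmitter task, replicated task). Illustrative values only.
measured_transfer_ms = {
    ("T1", "R1"): 4,
    ("T1", "R2"): 5,
    ("T2", "R1"): 40,
    ("T2", "R2"): 38,
}


def time_constant(measurements):
    """Return the single time constant: the maximum measured transfer time."""
    return max(measurements.values())


TIME_CONSTANT_MS = time_constant(measured_transfer_ms)
print(TIME_CONSTANT_MS)  # 40
```

Because a single maximum is taken over all pairs, the slowest transmission path sets the constant that is applied to every message.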
The messages received by a second processor, i.e. a processor in which one of the replicated tasks resides, are placed in a queue in which they are sequenced in increasing order of time stamp. A stability time is calculated for each message, which time is equal to the sum of the time stamp and the time constant. A message at the input of a processor is said to be "stable" when it is no longer possible for any other message having an earlier time stamp to arrive at the input of that processor. A stable message is therefore detected when the clock of the second processor gives a time that is later than the stability time of the message. The message is then taken from the queue and the data contained in the message is supplied to the replicated task residing in the second processor.
This process is performed in the same way on the other second processor.
In this way, the replicated tasks take the data that they receive into account in identical chronological order as given by the time stamps of the messages encapsulating the data.
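The queueing procedure described above can be sketched as follows, purely as an illustration. The class name `StabilityQueue`, the method names, and the numeric values are hypothetical. The sketch assumes synchronized clocks and, as an additional assumption not stated in the cited document, uses the sender identifier to order messages from different processors that bear the same time stamp.

```python
import heapq


class StabilityQueue:
    """Holds received messages and releases them, once stable, in time-stamp order."""

    def __init__(self, time_constant):
        self.time_constant = time_constant
        self._heap = []  # entries: (time_stamp, sender, data)

    def receive(self, time_stamp, sender, data):
        # The heap keeps the message with the earliest time stamp at the front.
        heapq.heappush(self._heap, (time_stamp, sender, data))

    def release_stable(self, now):
        """Return the data of every message whose stability time is earlier than `now`."""
        released = []
        while self._heap and now > self._heap[0][0] + self.time_constant:
            _time_stamp, _sender, data = heapq.heappop(self._heap)
            released.append(data)
        return released


# Two replicas receive the same two messages in different arrival orders.
replica_a = StabilityQueue(time_constant=40)
replica_b = StabilityQueue(time_constant=40)

replica_a.receive(100, "T1", "x")
replica_a.receive(90, "T2", "y")

replica_b.receive(90, "T2", "y")
replica_b.receive(100, "T1", "x")

order_a = replica_a.release_stable(now=141)
order_b = replica_b.release_stable(now=141)
print(order_a, order_b)  # ['y', 'x'] ['y', 'x']
```

Both replicas deliver the data in the order given by the time stamps, whatever the order of arrival, which is the property relied upon for active redundancy.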
That known synchronization system suffers from the following drawback.
FIG. 1 is a diagram showing a monitoring/control facility comprising an operating station 1 (PC) (level 2 of the CIM model) connected to automatic-process monitoring computers 3 (CA1, CA2) via a message transmission network 2 accepting a communications protocol, e.g. an aperiodic communications protocol. The automatic-process monitoring computers 3 are connected, via a message transmission network 4 accepting a communications protocol, e.g. a periodic communications protocol, to remote interfaces 5 (E/S1, E/S2) which receive data from or transmit data to a physical process 6 to be monitored. In general, the data processing between the remote interfaces 5 and the automatic-process monitoring computers 3 takes place in real time. The characteristics of the networks 2 and 4 are such that the message transmission times between the remote interfaces 5 and the computers 3 are considerably shorter than the message transmission times between the operating station 1 and the computers 3.
Because the time constant used for determining the stability time of each message is uniform for all the messages, the messages are delayed by the same time value on average. As a result, a first message (coming from the remote interface) which is part of a real time processing process and which has a transfer time that is short relative to the transfer time of a second message (coming from the operating station) can be delayed by a much longer time than the second message, if the second message has a stamp that is earlier than the stamp of the first message, even though the second message is received by the automatic-process monitoring computer after the first message. Therefore, that known replicated task synchronization system may cause the response times for the real time process to be exceeded, and it is therefore not suited to such a monitoring/control facility.
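The drawback can be illustrated numerically with the following sketch. All values are hypothetical: a time constant of 50 ms set by the slow path from the operating station, and a transfer time of 2 ms from a remote interface. Clocks are assumed to be synchronized, so that a message arrives at its time stamp plus its transfer time.

```python
TIME_CONSTANT_MS = 50  # hypothetical: maximum transfer time, set by the slow network


def queue_wait(time_stamp, transfer_time, time_constant=TIME_CONSTANT_MS):
    """Time a message spends in the queue between its arrival and its stability time."""
    arrival = time_stamp + transfer_time
    stability_time = time_stamp + time_constant
    return max(0, stability_time - arrival)


# First message: from a remote interface, short transfer time.
fast_wait = queue_wait(time_stamp=100, transfer_time=2)
# Second message: from the operating station, earlier stamp, long transfer time.
slow_wait = queue_wait(time_stamp=95, transfer_time=50)

print(fast_wait, slow_wait)  # 48 0
```

In this example the real time message is held for 48 ms after it arrives, whereas the message from the operating station is not held at all, even though it reaches the computer 43 ms later.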
Document EP-A-0,445,954 also discloses a system for synchronizing concurrent tasks, which system is based on using virtual time stamps (counters). With that known system the stability times of the messages are not taken into account.