The invention relates to synchronizing local clocks in a distributed computer network. Of particular interest are computer networks that exchange information via messages sent on communication links between components in the network. Without restriction to a specific realization of the invention, we use standard Ethernet as an illustrating example. In standard Ethernet, end systems are connected to network switches via bi-directional communication links. An end system communicates with a second end system or a group of end systems by sending a message to the switch, which then relays the message to the receiving end system or end systems. End systems can likewise be connected directly to each other via bi-directional communication links, which in certain configurations makes a clear differentiation between end systems and switches difficult. Hence, we generally use the term component to refer to a physical device that can be either an end system or a switch. Whether a component is said to be an end system or a switch is determined by its usage rather than its physical appearance.
The clock synchronization problem is the problem of bringing the local clocks of different components into close agreement.
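As a rough illustration of what "close agreement" can mean, the following sketch measures the largest difference between any two local clock readings taken at the same instant and compares it against a bound. The function names, the component names, the microsecond unit and the bound are assumptions made for this example only; they are not taken from the invention.

```python
def clock_spread_us(readings_us):
    """Largest difference between any two local clock readings (microseconds)."""
    values = list(readings_us.values())
    return max(values) - min(values)


def in_close_agreement(readings_us, bound_us):
    """True if all local clocks lie within bound_us of each other."""
    return clock_spread_us(readings_us) <= bound_us


# Hypothetical readings of three components, sampled at the same instant.
readings = {"end_system_A": 1000.0, "end_system_B": 1003.5, "switch_1": 998.0}

spread = clock_spread_us(readings)                 # 1003.5 - 998.0
ok_loose = in_close_agreement(readings, 10.0)      # spread fits a 10 us bound
ok_tight = in_close_agreement(readings, 5.0)       # spread exceeds a 5 us bound
```

Under this reading, a synchronization protocol succeeds when the spread stays below the chosen bound for all non-faulty components.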
For fault-tolerance reasons, a multitude of components can be configured to generate synchronization messages. The components that generate the synchronization messages may be distributed with a high number of intermediate components between them. In an illustrating example of an Ethernet network that consists of ten switches connected in sequence, the components that generate the synchronization messages may be located ten hops from each other. In standard Ethernet networks, the transmission latency and transmission jitter are a function of the number of hops between any two senders. This means that the receive order of synchronization messages is not necessarily the send order of these messages. For example, an end system located at the same switch as an end system A that generates synchronization messages will likely receive the synchronization messages from end system A earlier than the synchronization messages from an end system B that is placed at a switch three hops away, even if end system B sends its synchronization messages earlier. Conversely, it cannot be concluded that the synchronization messages from end systems in close proximity are always received earlier than those from end systems that are farther away, because in standard Ethernet networks the buffer allocation in the switches is not fully predictable at runtime.
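The reordering effect described above can be sketched with a small model in which the arrival time of a message is its send time plus a fixed delay per traversed switch plus a queuing delay in the switch buffers. The per-hop delay, the queuing values, the send times and all names below are hypothetical and chosen only to make the two cases visible; real switch behavior is not modeled.

```python
from dataclasses import dataclass

PER_HOP_LATENCY_US = 10.0  # assumed fixed forwarding delay per traversed switch


@dataclass(frozen=True)
class SyncMessage:
    sender: str
    send_time_us: float  # instant at which the sender emits the message
    hops: int            # number of switches between sender and receiver
    queuing_us: float    # assumed total buffer delay along the path


def arrival_time_us(msg):
    """Arrival time at the receiver under the simple additive delay model."""
    return msg.send_time_us + msg.hops * PER_HOP_LATENCY_US + msg.queuing_us


def send_order(messages):
    return [m.sender for m in sorted(messages, key=lambda m: m.send_time_us)]


def receive_order(messages):
    return [m.sender for m in sorted(messages, key=arrival_time_us)]


# Case 1: B sends first but is farther away, so A's message arrives first.
case_far_sender_late = [
    SyncMessage("A", send_time_us=5.0, hops=1, queuing_us=0.0),
    SyncMessage("B", send_time_us=0.0, hops=4, queuing_us=2.0),
]

# Case 2: A is close, but its message waits in a switch buffer behind other
# traffic, so the message from the distant sender B arrives first.
case_near_sender_queued = [
    SyncMessage("A", send_time_us=5.0, hops=1, queuing_us=40.0),
    SyncMessage("B", send_time_us=0.0, hops=4, queuing_us=2.0),
]
```

In the first case the send order is B, A while the receive order is A, B. In the second case the nearby sender A is received last, which shows that proximity alone does not determine the receive order.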
The problem of synchronizing local clocks has a long history, and many algorithms are known that claim to synchronize local clocks even in the presence of failures (Byzantine clock synchronization, Lamport, L. and Melliar-Smith, P. M., ACM SIGOPS Operating Systems Review, volume 20, number 3, p. 10-16, 1986, ACM New York, N.Y., USA; Optimal clock synchronization, Srikanth, T. K. and Toueg, S., Journal of the ACM (JACM), volume 34, number 3, p. 626-645, 1987, ACM New York, N.Y., USA; A paradigm for reliable clock synchronization, Schneider, F. B., Department of Computer Science Technical Report TR, p. 86-735; Clock synchronization in distributed real-time systems, Kopetz, H. and Ochsenreiter, W., IEEE Transactions on Computers, volume 36, number 8, p. 933-940, 1987, IEEE Computer Society Washington, D.C., USA).
These protocols, however, were developed either under the assumption of a fully connected point-to-point communication infrastructure between the components to be synchronized, or for operation in a contention-free environment. This invention specifies basic building blocks for synchronization protocols that operate in co-existence with other protocols on the same physical network infrastructure. We call such protocols transparent.
Well-known transparent protocols that allow the synchronization of local clocks in Ethernet-based networks are, for example, the Network Time Protocol (NTP) and the IEEE 1588 clock synchronization protocol. These protocols, however, are not fault-tolerant in the sense of fault-masking, where fault-masking means that no functional service degradation is experienced when a component fails.