In distributed systems, it is assumed that each node consults its own internal clock. Since clocks in distributed systems drift apart, they must periodically be resynchronized, that is, brought very close together in value. This resynchronization is necessary to carry out many protocols for distributed systems, e.g., Strong et al.
In a seminal article, Lamport, 21 CACM, 558-565, July 1978, "Time, Clocks, and the Ordering of Events in a Distributed System", uses the concept of one event happening before another to define a partial ordering of events. Lamport describes a protocol for extending this partial ordering to a total ordering for synchronizing events, and then applies it to a system of physical clocks. This guarantees that a set of correct clocks will differ by no more than a specifiable amount.
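By way of illustration only, the extension from a partial ordering to a total ordering can be sketched with a logical counter per process. The sketch below is a minimal illustration, not Lamport's text: the class name LamportClock, its method names, and the use of a numeric process identifier to break ties are all assumptions introduced here.

```python
class LamportClock:
    """Logical clock for one process (hypothetical names, illustration only)."""

    def __init__(self, pid):
        self.pid = pid    # process identifier, used only to break ties
        self.time = 0     # logical counter, not a physical clock reading

    def tick(self):
        # Every local event advances the counter and gets a (time, pid) stamp.
        self.time += 1
        return (self.time, self.pid)

    def send(self):
        # Sending is itself an event; the stamp travels with the message.
        return self.tick()

    def receive(self, stamp):
        # Move past the sender's counter so the receive is ordered after the send.
        sender_time, _ = stamp
        self.time = max(self.time, sender_time)
        return self.tick()


a, b = LamportClock(1), LamportClock(2)
sent = a.send()            # event on process 1
got = b.receive(sent)      # event on process 2, caused by the send
assert sent < got          # tuple comparison gives the total order
```

Comparing the (time, pid) tuples orders any two events, and any event that happens before another always receives the smaller stamp. Events that are concurrent are ordered arbitrarily but consistently by the tie-break.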
Lamport and Melliar-Smith, "Synchronizing Clocks in the Presence of Faults", SRI Technical Reports, published July 13, 1981, describe clock resynchronization in a distributed system in which each processor is required to broadcast its time value. In turn, each processor receives the clock values of every other processor, discards extreme values, and takes an average value about which to synchronize. In order to achieve clock synchronization in the presence of f faults, Lamport and Melliar-Smith require (3f+1) processors. Clock synchronization in this context is simply the condition that clocks differ by no more than a specified upper bound.
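The discard-and-average step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of the cited report: the function name resync_value is hypothetical, and "discarding extreme values" is assumed here to mean dropping the f lowest and f highest readings before averaging.

```python
def resync_value(readings, f):
    """Return the value a processor would resynchronize to.

    readings: clock values received from the processors (own value included).
    f: number of faulty clocks to tolerate (assumed trimming rule: drop the
       f smallest and f largest readings, then average the remainder).
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if len(readings) <= 2 * f:
        raise ValueError("need more than 2f readings to trim f from each end")
    ordered = sorted(readings)
    kept = ordered[f:len(ordered) - f]   # discard the extremes
    return sum(kept) / len(kept)         # average what is left


# One wildly wrong clock (5000.0) has no influence on the result.
print(resync_value([100.0, 101.0, 99.0, 5000.0], f=1))
```

Because at most f readings come from faulty clocks, every reading that survives the trimming lies within the range spanned by the correct clocks, so the average does too.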
Reference should be made to Strong et al., copending U.S. Application Ser. No. 06/485,573, filed Apr. 18, 1983, entitled "A Method for Achieving Multiple Processor Agreement Optimized for No Faults". Strong describes a method for achieving Byzantine Agreement among n processors in a reliable (f+1)-connected network with guaranteed early stopping in the absence of faults and eventual stopping for f<(n/2) faults. Byzantine Agreement is a protocol which guarantees that eventually all correct processors will agree on a value. By way of contrast, clock synchronization protocols must guarantee that all correct processors agree (within a small specified margin of error) on a time.
Prior art protocols achieve resynchronization by exchanging messages and, as previously suggested, require a significant quantity of message passing. In the Lamport/Melliar-Smith case, approximately n^(f+1) messages are exchanged, where n is the total number of processors and f is the number of tolerable faults. Also, some systems stamp nonconcurrent events with the same time.