1. Field of the Invention
This invention relates generally to the synchronization of clocks in distributed digital systems and particularly to a method for logically decoupling monotonic amortization of synchronization errors from frequent clock synchronization events.
2. Discussion of the Related Art
In distributed computer systems, clock synchronization is necessary to coordinate the actions of different processors. Clock synchronization procedures can be generally classified as either internal or external. Internal clock synchronization procedures operate to hold processor clocks within some maximum relative deviation from each other. External clock synchronization procedures operate to hold each processor clock within some maximum absolute deviation from an external real-time reference.
All clock synchronization procedures include periodic "synchronization events", a term used herein to denote periodic adjustments to a logical clock time. A synchronizing adjustment is valued so that a "logical" clock time, obtained by adding an "offset" to a "hardware" clock time, is synchronized to the appropriate internal or external standard. The "offset" is repeatedly "adjusted" to maintain a certain synchronization "precision" without adjusting the hardware clock.
A "discrete" adjustment causes the logical clock to instantaneously leap ahead or back, thereafter continuing to run at the "base rate" of the underlying hardware clock. Discrete logical clocks are generally neither monotonic nor continuous, and many distributed applications requiring clock synchronization cannot tolerate discontinuous logical clock behavior.
The problems of discontinuity in discrete logical clocks can be overcome by "amortizing" (i.e., distributing) the adjustment continuously over a time interval by increasing or decreasing the base rate of the logical clock relative to the hardware clock during an "amortization interval".
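The difference between a discrete and an amortized adjustment can be illustrated with a minimal sketch. The function names, the linear spreading of the adjustment, and the use of hardware-clock time to measure the amortization interval are assumptions made for illustration only and are not taken from any of the cited references.

```python
def discrete_clock(hw, offset, adjust_at, delta):
    """Logical time when the whole adjustment `delta` is applied
    instantaneously at hardware time `adjust_at` (hypothetical helper)."""
    applied = delta if hw >= adjust_at else 0.0
    return hw + offset + applied


def amortized_clock(hw, offset, adjust_at, delta, interval):
    """Logical time when `delta` is spread linearly over the amortization
    interval [adjust_at, adjust_at + interval] (hypothetical helper).

    During the interval the logical clock runs at rate 1 + delta/interval
    relative to the hardware clock, so it stays monotonic as long as
    delta > -interval.
    """
    if hw <= adjust_at:
        applied = 0.0                      # amortization not yet started
    elif hw >= adjust_at + interval:
        applied = delta                    # adjustment fully absorbed
    else:
        applied = delta * (hw - adjust_at) / interval
    return hw + offset + applied
```

In this sketch, setting the clock back by 2 time units at hardware time 10 makes the discrete clock jump from just under 10 to 8, whereas the amortized clock with an interval of 8 merely runs at three quarters of the base rate until hardware time 18 and never decreases.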
Thus, the problem of clock synchronization in distributed systems is two-fold: a "precision" requirement is imposed on synchronization accuracy with respect to the internal or external standard, and a "smoothness" requirement is imposed for temporal monotonicity and continuity of the clock time. The fundamental problem in the art is that when clock synchronization events occur too close together, amortization of the clock adjustments for earlier synchronizations may be incomplete and the remaining non-amortized difference may distort the results of later synchronization.
Until recently, it was widely believed in the art that the only sure way to overcome this problem was to impose an upper limit on the frequency of synchronization events. This widespread belief led directly to the contention that amortization negatively affects the precision achievable by a clock synchronization procedure. Unfortunately, imposing an upper frequency limit on synchronization events does limit precision and is thus unacceptable for clock synchronization procedures that attempt to maintain a given precision by scheduling synchronization events responsive to estimated error rather than according to a standard schedule.
Practitioners in the art have suggested several clock synchronization procedures intended to overcome the problems introduced in distributed computer systems. For instance, in U.S. Pat. No. 4,882,739, Richard J. Potash et al. disclose a method for adjusting clocks in several data processors to a common time base. Potash et al. exchange time signals between a master clock at a central location and several slave clocks at various remote locations so that the round-trip transmission time can be measured and incorporated in each synchronizing event. Also, the ratio between central and remote clock frequencies is measured and used to continuously adjust each remote clock.
Other practitioners have disclosed synchronization procedures involving round-trip transmission time measurements in a fashion similar to that of Potash et al. In U.S. Pat. No. 5,052,028, Eduard Zwack discloses a phase synchronization method for clock signals in a communications network that relies on round-trip data flow. In Canadian Patent 2,060,462, Scott E. Farleigh discloses a timing distribution method for an asynchronous fiber optic ring network wherein a master node provides timing information that is used at any ring node to synchronize the local node clock by transferring a measure of propagation delay. Japanese Patent 4-294413 discloses a similar network system time adjusting technique as does U.S. Pat. No. 5,062,124.
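The general idea of incorporating a measured round-trip time into a synchronizing adjustment can be sketched as follows. This is a generic illustration and not the specific method of any reference cited above; the function name and the assumption that the one-way delay equals half the round-trip time are hypothetical simplifications.

```python
def estimate_offset(t_send, t_master, t_recv):
    """Estimate the correction for a slave clock from one request/reply
    exchange with a master (hypothetical helper).

    t_send   -- slave clock reading when the request was sent
    t_master -- master clock reading carried in the reply
    t_recv   -- slave clock reading when the reply arrived

    Assumes the reply took half of the measured round-trip time, and
    ignores clock drift during the exchange.
    Returns (correction, error_bound).
    """
    round_trip = t_recv - t_send
    # Best guess of the master's time at the moment the reply arrived.
    master_now = t_master + round_trip / 2.0
    correction = master_now - t_recv
    # The true one-way delay lies somewhere in [0, round_trip].
    error_bound = round_trip / 2.0
    return correction, error_bound
```

For example, a request sent at slave time 100.0 and answered with master time 105.0 at slave time 100.5 yields a correction of 4.75 with an uncertainty of 0.25 under these assumptions. A shorter round trip tightens the bound, which is why such procedures are sensitive to transmission faults and delays.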
Clock synchronization procedures that require accurate round-trip propagation delay data are subject to errors arising from transmission faults, and several practitioners have accordingly suggested fault-tolerant master-slave clock synchronization procedures for distributed systems. For instance, F. Christian et al. (IBM Technical Disclosure Bulletin, Vol. 33, No. 8, pp. 107-109, January 1991) describe a method for converting a master-slave clock synchronization protocol into a decentralized clock synchronization protocol that accommodates multiple time signal sources and multiple processor failures. Similarly, F. Christian (IBM Technical Disclosure Bulletin, Vol. 31, No. 2, p. 91, July 1988) discloses a protocol for synchronizing the clock of a slave processor with the clock of the master processor using a variable delay between synchronization events derived from the probability of achieving the requisite precision. Christian provides a synchronization procedure for distributed systems subject to unbounded random communication delays, including indefinite delays arising from communication and clock failures. Also, co-pending patent application Ser. No. 07/970,666 filed on Nov. 3, 1992 and assigned to the Assignee hereof discloses an anonymous time synchronization method that requires few round-trip or hand-shaking messages because each network node generally synchronizes itself based on eavesdropping on anonymous messages from other nodes.
Although the above practitioners provide useful solutions to the synchronization precision problem, none suggests or considers solutions to the synchronization smoothness problem. However, in U.S. Pat. No. 4,584,643, Joseph Y. Halpern et al. disclose a decentralized synchronization method that synchronizes distributed clocks in the presence of faults, guaranteeing that no clock in a correct processor ever deviates from another such clock by more than some maximum deviation. This internal synchronization technique relies on "unforgeably signed" time messages from the several system nodes. Also, in U.S. Pat. No. 4,531,185, Halpern et al. disclose a centralized version of their clock synchronization process. Both the centralized and the decentralized processes impose smoothness constraints that require that a correct clock never be adjusted by more than a maximum amount during a predetermined time period and that further require that a corrected clock never be set back. These requirements address the synchronization smoothness problem by forcing the local clocks to remain piece-wise linear and monotonically non-decreasing.
Until now, few practitioners have considered the combined precision and smoothness problem, primarily because of a widespread belief that the two problems are not compatibly resolvable. Frank Schmuck and Flaviu Christian ("Continuous Clock Amortization Need Not Affect the Precision of a Clock Synchronization Algorithm", Proc. 9th Association for Computing Machinery Symposium on Principles of Distributed Computing, pp. 133-143, 1990) for the first time describe the theoretical conditions that must be imposed on a distributed computing system to permit compatible solutions to both the precision problem and the smoothness problem of clock synchronization. Schmuck et al. provide theoretical proof that amortization can be added to existing internal and external clock synchronization processes under some circumstances without worsening synchronization precision.
Schmuck et al. demonstrate that if the amortization rate is no smaller than the maximum hardware clock drift rate, then amortization causes no loss of precision for external clock synchronization processes. Unfortunately, they also find that fast amortization alone is not sufficient to preserve precision in a discrete internal clock synchronization process without further limiting the internal process so that all clocks are resynchronized within a small real-time interval during which both old and new clocks must stay synchronized. These additional limitations require all amortization to be completed before the next synchronizing event for internal synchronization procedures, which is the general limitation already known in the art.
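The external-synchronization condition stated above can be expressed as a simple check. The sketch below assumes, for illustration only, that the amortization rate is the magnitude of the adjustment divided by the length of the amortization interval and that the maximum drift rate is a dimensionless fraction such as 1e-4; these definitions and all names are hypothetical and are not quoted from Schmuck et al.

```python
def amortization_rate(delta, interval):
    """Extra rate, relative to the hardware clock, at which the logical
    clock runs while absorbing an adjustment `delta` over `interval`
    (assumed definition, for illustration)."""
    return abs(delta) / interval


def meets_external_condition(delta, interval, max_drift):
    """True when the amortization rate is no smaller than the maximum
    hardware clock drift rate `max_drift`."""
    return amortization_rate(delta, interval) >= max_drift


def longest_interval(delta, max_drift):
    """Longest amortization interval that still satisfies the condition
    for a given adjustment (hypothetical helper)."""
    return abs(delta) / max_drift
```

Under these assumptions, an adjustment of 0.002 seconds with a maximum drift rate of 1e-4 satisfies the condition when amortized over 10 seconds but not when amortized over 100 seconds; the longest admissible interval is about 20 seconds.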
Thus, even the iconoclastic theoretical work of Schmuck et al. does not offer a solution to the internal synchronization problem where synchronization events must occur too frequently to allow complete amortization of previous clock adjustments. Moreover, although Schmuck et al. showed that external synchronization processes require only that the amortization rate be faster than the hardware clock drift rate, they discuss only theoretical proofs and do not disclose or suggest useful embodiments of synchronization systems that solve both precision and amortization problems.
There is accordingly a clearly felt need in the art for a clock synchronization process that permits amortization of clock adjustments to provide smooth monotonic time in external and internal clock synchronization processes without an upper synchronization frequency limit that unacceptably limits clock precision. These unresolved problems and deficiencies are clearly felt in the art and are solved by this invention in the manner described below.