Aspects of this disclosure generally relate to systems and methods for fault-tolerant synchronization protocols and, in particular, to self-stabilizing clock synchronization protocols and systems for distributed systems.
Distributed systems, in which components located on networked computers communicate and coordinate their actions by passing messages, have increasingly become an integral part of many safety-critical computing applications. As such, there is a need for system designs that incorporate complex fault-tolerant resource management functions to provide globally coordinated operations with ultra-reliability. Robust clock synchronization has consequently become a fundamental component of many fault-tolerant safety-critical distributed systems.
Most clocks employ oscillators as timekeeping elements. Such oscillators may consist of physical objects that oscillate repetitively at a nominally constant frequency, i.e., physical oscillators. Since physical oscillators are inherently imperfect, the local clocks of the nodes of a distributed system, driven by these physical oscillators, do not keep perfect time and can drift with respect to real time and with respect to one another. Thus, the local clocks of the nodes must periodically be resynchronized. As a result, there is a need for a fault-tolerant system with a clock synchronization algorithm that tolerates imprecise local clocks and faulty behavior by some processes.
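By way of illustration only, the drift described above can be sketched with a simple linear clock model. The sketch assumes each local clock advances at a rate of (1 + drift_rate) relative to real time, that the magnitude of the drift rate is bounded, and that two clocks agree exactly immediately after resynchronization. The function names and numeric values are hypothetical, and the sketch ignores faults and communication delay; it is not the protocol of this disclosure.

```python
def local_clock(real_time, drift_rate, offset=0.0):
    """Reading of a drifting local clock after real_time seconds.

    Assumed linear model: the clock runs at (1 + drift_rate) times real time.
    """
    return offset + (1.0 + drift_rate) * real_time


def skew(real_time, drift_a, drift_b):
    """Absolute difference between two local clocks that started in agreement."""
    return abs(local_clock(real_time, drift_a) - local_clock(real_time, drift_b))


def resync_interval(max_skew, max_drift):
    """Longest real-time interval that keeps two clocks within max_skew.

    Worst case under the assumed model: one clock runs fast by max_drift and
    the other runs slow by max_drift, so the skew grows at 2 * max_drift.
    """
    return max_skew / (2.0 * max_drift)


if __name__ == "__main__":
    # Hypothetical drift rates of +/- 10 parts per million.
    print(skew(1000.0, 1e-5, -1e-5))
    print(resync_interval(0.001, 1e-5))
```

Under these assumptions the skew between two nodes grows linearly with elapsed time, which is why a bound on acceptable skew translates into a maximum period between resynchronizations.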
Prior synchronization solutions have not provided an approach that performs the above functions with precision, accuracy, and efficiency, or that is applicable across a variety of system architectures. Therefore, there is a need for systems and methods that address one or more of the deficiencies described above.