One or more aspects relate, in general, to coordinated timing networks, and in particular, to recovery within such networks.
A Coordinated Timing Network (CTN) is a network in which multiple distinct computing systems maintain time synchronization with one another. Systems in the Coordinated Timing Network employ a message-based protocol, referred to as Server Time Protocol (STP), to pass timekeeping information between the systems over existing high-speed data links. This enables the time-of-day (TOD) clocks at each system to be synchronized to the accuracy required in today's high-end computing systems. Since the protocol makes use of technology within a computing system, synchronization accuracy scales as technology improves. A computing system that provides time to other computing systems is referred to herein as a time server or server.
Within a Coordinated Timing Network for STP, there is to be only one server acting as the source of time for the network. If there is more than one time source, the sources could diverge, leading to a data integrity exposure. Likewise, if there is no server acting as the single source of time for the network, the clocks on the multiple servers could drift apart, which also raises a data integrity exposure.
The Server Time Protocol defines a primary time server (PTS) and a backup time server (BTS). Should the primary time server fail in some way, the backup time server takes over as the source of time for the network. However, deciding whether to take over is complicated. A loss of communication with the primary time server does not necessarily mean that the server is no longer available; it may instead be the result of a failed communication link.
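The ambiguity in the takeover decision can be illustrated with a minimal Python sketch. It is not taken from the Server Time Protocol itself: the names BackupView, Action, and decide, and the idea of a separate primary_confirmed_down signal, are assumptions introduced here only to show why a lost link alone is insufficient evidence for a takeover.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    STAY_BACKUP = "stay_backup"
    TAKE_OVER = "take_over"


@dataclass
class BackupView:
    """What a backup time server can observe (hypothetical model)."""
    link_to_primary_up: bool      # the direct path to the primary responds
    primary_confirmed_down: bool  # assumed independent evidence of primary failure


def decide(view: BackupView) -> Action:
    """Decide whether the backup should become the time source.

    Assumption: taking over while the primary is still running would
    create two time sources, so the backup takes over only when the
    primary's failure is confirmed by something other than the lost link.
    """
    if view.link_to_primary_up:
        # The primary is reachable and remains the single time source.
        return Action.STAY_BACKUP
    if view.primary_confirmed_down:
        # The link is down and the failure is independently confirmed.
        return Action.TAKE_OVER
    # The link is down but the primary may still be serving time:
    # the ambiguous case described above.
    return Action.STAY_BACKUP


if __name__ == "__main__":
    for view in (
        BackupView(link_to_primary_up=True, primary_confirmed_down=False),
        BackupView(link_to_primary_up=False, primary_confirmed_down=False),
        BackupView(link_to_primary_up=False, primary_confirmed_down=True),
    ):
        print(view, "->", decide(view).value)
```

In this sketch the second case, in which the link is lost but no failure is confirmed, is the one that makes the decision difficult: staying as backup risks leaving the network with no time source if the primary has in fact failed, while taking over risks two time sources if only the link has failed.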