Distributed transaction systems, particularly financial trading systems, require tightly synchronized clocks across servers. For example, if the clock at server A is 2 milliseconds ahead of the clock at server B, and server A posts a price of $1 for stock S at time t+2 while server B posts a price of $2 for stock S at time t+1, then the log shows stock S falling in price even though it was actually rising. Precise time synchronization is also needed in other distributed transaction systems, in telecommunications, in industrial control, and in military equipment. There are a number of other applications where precise estimation of one-way transfer time (i.e., one-way network delay) is important. For example, a financial trading system may need to know how long ago an order or price data was sent from an exchange or other source.
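The skew arithmetic in the example above can be made concrete with a short sketch (all variable names are ours, chosen for illustration): server A's clock runs 2 time units ahead, so sorting the log by timestamp reverses the true order of the two price postings.

```python
# Server A's clock is 2 units ahead of true time; server B's clock is accurate.
t = 100  # arbitrary reference time
events = [
    # (true_time, logged_timestamp, server, price)
    (t,     t + 2, "A", 1.00),  # A posts $1; A's fast clock stamps it t+2
    (t + 1, t + 1, "B", 2.00),  # B posts $2; B's accurate clock stamps it t+1
]

true_order   = [price for _, _, _, price in sorted(events, key=lambda e: e[0])]
logged_order = [price for _, _, _, price in sorted(events, key=lambda e: e[1])]

print(true_order)    # [1.0, 2.0] -- the stock actually rose
print(logged_order)  # [2.0, 1.0] -- the log shows it falling
```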
Suppose site A sends a message M to site B containing the time of day according to the clock of A. For B to set its own clock to match, B must estimate how much time has passed between the moment that A calculated the time of day and the moment that B read the time of day from the message M. That is, if t microseconds are required for the transfer of message M from A to B, then B should read the time out of M and add t microseconds before setting its own clock. The delay t is conventionally referred to as the time transfer latency.
One method for estimating t is for B to request a time update by sending a request message R, for which a response is then generated by A and returned to B as message M. The elapsed time between request and response is the round trip time. If the network is symmetric, B can divide the round trip time by two to estimate the time transfer latency. A second method is to schedule the transmission of M from A to B at precise periodic intervals.
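The round-trip method can be sketched in a few lines (a hypothetical illustration, not taken from any particular protocol implementation; times are kept as integer microseconds so the arithmetic is exact):

```python
def estimate_one_way_latency_us(t_send_us, t_recv_us):
    """One-way transfer latency t = round trip / 2, under the
    symmetric-network assumption described above."""
    return (t_recv_us - t_send_us) // 2

def corrected_time_us(server_time_us, t_send_us, t_recv_us):
    """The time B should set: the time read out of M plus t microseconds."""
    return server_time_us + estimate_one_way_latency_us(t_send_us, t_recv_us)

# B sends request R at 10,000,000 us on its own clock and receives response M
# at 10,004,000 us (a 4,000 us round trip); M reports A's clock as
# 50,000,000 us, so B estimates a 2,000 us one-way latency and adds it.
print(corrected_time_us(50_000_000, 10_000_000, 10_004_000))  # 50002000
```

If the path is asymmetric, the RTT/2 estimate is biased by half the asymmetry, which is one source of the errors discussed later in this section.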
There are at least three protocols currently in wide use for network time synchronization. The first is known as the Network Time Protocol (NTP). NTP is an industry standard for enterprise and other commodity computing applications. Nearly every consumer and enterprise personal computer (PC) and computer server includes an implementation of NTP. When using NTP, a client sends requests for time to a server, which sends the time as a response.
The Inter-Range Instrumentation Group (IRIG) STD 168-98 protocol is another client-server time protocol that has been used in military systems since the 1950s. IRIG requires special cabling.
The more recent IEEE 1588 Precision Time Protocol (PTP) is a client-server protocol that is generally driven by the server, which hosts a so-called “master clock.” In a standard PTP time update, the master clock multicasts a Sync Message containing the time to a number of slave clocks residing at clients. After a short delay, the server transmits a follow-up message which contains the time that the Sync Message “hit the wire.” To the extent that the time transfer delay is caused by transmission delays internal to the master clock and network contention (i.e., two or more simultaneous attempts to access a network resource), the slave clock can be set correctly using this information. If there are network devices such as switches and routers that can cause additional delay between the master and slave clocks, PTP anticipates that they will add information about those delays to the Sync Message in transit. Such devices are called “transparent clocks” in the PTP standard.
PTP also may rely on hardware timestamping in network interface devices. The idea is that the master clock network interface device will be able to inform the master clock of the precise time the Sync Message was transmitted—for use in the follow-up message—and that the slave clock's network interface device will record the time of arrival of the first bit of the Sync Message. This is done to eliminate software generated time transfer variability caused by delays in reading and timestamping the Sync Message.
PTP also has a second transaction type that is used to calculate round trip time, and this is driven by the slave clock (i.e., client). In this transaction, the client sends a Delay Request message to the master clock and receives a response that allows the client to compute round trip delay.
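The two PTP transactions together yield four timestamps from which the slave can compute both the path delay and its clock offset. The sketch below shows the standard arithmetic under PTP's symmetric-path assumption (variable names are ours; timestamps in nanoseconds): t1 is the master's Sync transmit time (carried in the follow-up), t2 the slave's Sync receive time, t3 the slave's Delay Request transmit time, and t4 the master's Delay Request receive time.

```python
def mean_path_delay(t1, t2, t3, t4):
    """Average one-way delay, assuming the network path is symmetric."""
    return ((t2 - t1) + (t4 - t3)) // 2

def offset_from_master(t1, t2, t3, t4):
    """How far the slave clock is ahead of the master clock."""
    return ((t2 - t1) - (t4 - t3)) // 2

# Example: the slave clock is 500 ns ahead and the true one-way delay
# is 1000 ns in each direction.
t1 = 0     # master sends Sync at master time 0
t2 = 1500  # slave receives it at slave time 0 + 1000 (delay) + 500 (offset)
t3 = 2000  # slave sends Delay Request at slave time 2000
t4 = 2500  # master receives it at master time 2000 - 500 + 1000

print(mean_path_delay(t1, t2, t3, t4))     # 1000
print(offset_from_master(t1, t2, t3, t4))  # 500
```

Note that any asymmetry between the two directions ends up split between the delay and offset estimates, which is why the variable-latency effects described next degrade accuracy.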
However, whether using NTP, IRIG, or PTP, there are multiple causes of variable time transfer latency, and this variability introduces error into the time calculation at the client. In some instances, the variability is caused by cache loading, data structure setup time, and/or scheduling variability in the software. In what follows, we call this variability "cold cache" delays, intending the term to cover both the physical cache and the other components mentioned above. In the past, common practice has been to estimate these asymmetries in the one-way delay times and build those numbers into the software/hardware, or to allow the end user to adjust the values manually. This has meant that the end user would attempt to measure these delays, which is often difficult to do. Further, the end user would typically assume that these delays never changed, which is often not the case. Accordingly, there is a need for a method to reduce the uncertainty associated with these factors in calculating the transfer time.