The present invention relates to wireless data networks and, more particularly, to a system and method for synchronizing outputs at multiple endpoints in a network which includes a wireless communication link.
While audio and video equipment has historically been connected by analog or digital point-to-point, one-way connections, an increasing portion of multimedia content is distributed over networks. For example, video and uncompressed audio may be streamed from an audio/video source in a media room or closet to a display and multiple speakers of a surround sound system in a remote room or rooms in a residence. Since it is difficult to retrofit finished structures with cabling, in many cases data, including video and audio data, is transmitted from a source to a display, speakers or other output devices over a network that includes a wireless communication link(s) utilizing low cost radio technologies such as frequency modulation and spread spectrum modulation to transport packetized digital data.
Synchronization of outputs and minimization of system latency are critical requirements for high quality audio whether or not combined with video. The human ear is sensitive to phase delay or channel-to-channel latency and multi-channel audio output with channel-to-channel latency greater than 1 microsecond (μs) is commonly described as disjointed or blurry sound. On the other hand, source-to-output delay or latency (“lip-sync”) greater than 10 milliseconds (ms) is commonly considered to be noticeable in audio-video systems. In a digital network, such as an audio/video system, a source of digital data transmits a stream of data packets to the network's end points where the data is presented. Typically, a pair of clocks at each node of the network controls the time at which a particular datum is presented and the rate at which data is processed, for example, the rate at which an analog signal is digitized or digital data is converted to an analog signal for presentation. The actual or real time that an activity, such as presentation of a video datum, is to occur is determined by “wall time,” the output of a “wall clock” at the node. A sample or media clock controls the rate at which data is processed, for example, the rate at which blocks of digital audio data are introduced to a digital to analog converter.
Audio video bridging (AVB) is the common name of a set of technical standards developed by the Institute of Electrical and Electronics Engineers (IEEE) that provide specifications directed to time-synchronized, low latency, streaming services over networks. The Precision Time Protocol (PTP) specified by “IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems,” IEEE Std. 1588-2008 and adopted in IEEE 802.1AS-2011—“IEEE Standard for Local and Metropolitan Area Networks—Timing and Synchronization for Time-Sensitive Applications in Bridged Local Area Networks” describes a system enabling distributed wall clocks to be synchronized within 1 μs over seven network hops. A master clock to which the remaining distributed clocks, the slave clocks, are to be synchronized is selected either by a “best master clock” algorithm or manually. Periodically, the device comprising the master clock (the “master device”) and the device(s) comprising the slave clock(s) (the “slave device(s)”) exchange messages which include timestamps indicating the master clock's “wall time” when the respective message was either transmitted or received by the master device. The slave device notes the local wall times when the respective messages were received or transmitted by it and calculates the offset of the slave clock relative to the master clock and the network delay, the time required for the messages to traverse the network from the master device to the slave device. With repeated measurements, the frequency drift of the slave clock relative to the master clock can also be determined. The slave clock can then be synchronized with the master clock by adjusting the slave clock's wall time for the offset and the network delay and by adjusting the slave clock's frequency for any frequency drift relative to the master clock.
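By way of illustration only, the offset, network delay and frequency drift calculation described above may be sketched as follows. The sketch assumes a delay request-response style exchange of four timestamps and a network delay that is equal in both directions and constant between exchanges; the names `Exchange`, `offset_and_delay` and `rate_ratio` are hypothetical and are not taken from the cited standards.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """One hypothetical master/slave message exchange (all times in the same unit)."""
    t1: float  # master wall time when the master sent its message
    t2: float  # slave wall time when the slave received that message
    t3: float  # slave wall time when the slave sent its reply
    t4: float  # master wall time when the master received the reply


def offset_and_delay(x: Exchange) -> tuple[float, float]:
    """Return (slave offset relative to master, one-way network delay).

    Assumes the delay is the same in both directions.
    """
    forward = x.t2 - x.t1  # equals delay + offset
    reverse = x.t4 - x.t3  # equals delay - offset
    offset = (forward - reverse) / 2
    delay = (forward + reverse) / 2
    return offset, delay


def rate_ratio(first: Exchange, second: Exchange) -> float:
    """Ratio of slave clock rate to master clock rate between two exchanges.

    Assumes the network delay did not change between the two exchanges.
    A value above 1.0 means the slave clock runs fast relative to the master.
    """
    return (second.t2 - first.t2) / (second.t1 - first.t1)


# Slave clock is 5 units ahead of the master; one-way delay is 2 units.
example = Exchange(t1=100.0, t2=107.0, t3=110.0, t4=107.0)
offset, delay = offset_and_delay(example)
```

In this sketch the slave would subtract `offset` from its wall time and scale its clock frequency by the reciprocal of `rate_ratio` to follow the master.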
PTP can synchronize wall clocks of an extensive network or even plural networks, but the accuracy of PTP can be strongly influenced by the loading and exposure to interference of the wireless communication link(s). An alternative to PTP for synchronizing the wall time at plural devices of a wireless network is the Time Synchronization Function (TSF) specified in IEEE 802.11, “IEEE Standard for Information Technology—Telecommunications and Information Exchange Between Systems Local and Metropolitan Area Networks.” Every 802.11 compliant device in a network known as a basic service set (BSS) includes a TSF counter. Periodically, during a beacon interval, devices of the BSS transmit a beacon frame containing a timestamp indicating the local wall time at the transmitting device and other control information. A receiving node or slave device receiving the beacon frame synchronizes its local time by accepting the timing information in the beacon frame and setting its TSF counter to the value of the received timestamp if the timestamp indicates a wall time later than the node's TSF counter.
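The timestamp adoption rule described above may be sketched, under stated assumptions, as follows. The class `TsfCounter` and its methods are hypothetical names used only for illustration; the sketch models the TSF counter as a 64-bit count of microseconds, a detail of IEEE 802.11 not recited above, and ignores transmission and processing delays.

```python
TSF_MODULUS = 2 ** 64  # assumption: 64-bit counter with microsecond ticks


class TsfCounter:
    """Hypothetical model of a node's TSF counter."""

    def __init__(self, value_us: int = 0) -> None:
        self.value_us = value_us % TSF_MODULUS

    def tick(self, elapsed_us: int) -> None:
        """Advance the local counter by the elapsed local time."""
        self.value_us = (self.value_us + elapsed_us) % TSF_MODULUS

    def on_beacon(self, beacon_timestamp_us: int) -> bool:
        """Adopt the beacon timestamp only if it is later than the local counter.

        Returns True when the local counter was changed.
        """
        adopted = beacon_timestamp_us > self.value_us
        if adopted:
            self.value_us = beacon_timestamp_us
        return adopted


node = TsfCounter(1000)
node.on_beacon(900)    # earlier than local time: ignored
node.on_beacon(1500)   # later than local time: adopted
```

Because a counter is only ever moved forward under this rule, the nodes of the BSS tend to converge on the time of the fastest-running counter.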
However, neither PTP nor TSF provides for synchronization of the media or sample clocks which control the rate at which application data is processed by the processing elements of the network's devices. The Audio/Video Bridging Transport Protocol (AVBTP) of “IEEE 1722-2011: Layer 2 Transport Protocol for Time Sensitive Applications in a Bridged Local Area Network” provides that each network end point (a device that receives or transmits data) is to recover the sample clock from data in the packetized data stream transmitted by the data source. Each data packet comprises plural application data samples, for example, audio data samples, and a timestamp indicating the wall time at which presentation of the application data in the packet is to be initiated. At each network end point, for example, an audio speaker unit, a sample clock is generated which oscillates at a frequency that enables the plural application data samples in a data packet to be presented for processing within the time interval represented by successive timestamps.
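As a minimal sketch of the relationship described above, the sample clock frequency implied by two successive timestamps may be computed as the number of samples in a packet divided by the interval between the timestamps. The function name `recovered_sample_rate_hz`, the use of nanosecond timestamps and the example figures are assumptions made for illustration; an actual end point would typically filter such estimates rather than use any single one directly.

```python
def recovered_sample_rate_hz(ts_prev_ns: int, ts_next_ns: int,
                             samples_per_packet: int) -> float:
    """Sample clock frequency implied by two successive presentation timestamps.

    ts_prev_ns, ts_next_ns: presentation wall times of consecutive packets,
    in nanoseconds (assumed unit).
    samples_per_packet: number of application data samples in each packet.
    """
    interval_ns = ts_next_ns - ts_prev_ns
    if interval_ns <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return samples_per_packet * 1_000_000_000 / interval_ns


# Illustrative figures: 6 samples per packet, packets spaced 125 microseconds apart.
rate = recovered_sample_rate_hz(0, 125_000, 6)
```

With these illustrative figures the implied sample clock frequency is 48 kHz.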
While PTP, TSF and AVBTP provide means for synchronizing distributed clocks, not all packets transmitted by a network data source, particularly packets transmitted wirelessly, reach their destinations. As packets are lost, each network end point, for example, each of the plural speaker units of a surround sound audio system, receives its own aliased subsample of the timestamps and, over time, the clocks of the respective network endpoints will not track one another. What is desired, therefore, are accurate, consistently synchronized sample clocks at a plurality of related network endpoints.
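The effect of packet loss on the timestamps seen by different end points may be illustrated with the following sketch. The data are synthetic and the packet sequence numbers, the nanosecond units and the function name `interval_estimates_ns` are assumptions introduced only for this example. Two end points receive the same stream, but one of them loses a packet and therefore derives a different series of per-packet interval estimates.

```python
def interval_estimates_ns(received: list[tuple[int, int]]) -> list[float]:
    """Per-packet timestamp interval seen by one end point.

    received: (sequence_number, timestamp_ns) pairs for the packets that
    actually arrived, in order. A gap in sequence numbers means packets
    were lost, so the interval is averaged over the gap.
    """
    estimates = []
    for (seq_a, ts_a), (seq_b, ts_b) in zip(received, received[1:]):
        estimates.append((ts_b - ts_a) / (seq_b - seq_a))
    return estimates


# Synthetic stream: nominal spacing of 125,000 ns, with packet 2 stamped 10 ns late.
sent = [(0, 0), (1, 125_000), (2, 250_010), (3, 375_000), (4, 500_000)]

speaker_a = sent                               # receives every packet
speaker_b = [p for p in sent if p[0] != 2]     # loses packet 2

estimates_a = interval_estimates_ns(speaker_a)
estimates_b = interval_estimates_ns(speaker_b)
```

Here the first end point observes the 10 ns excursion in the timestamps while the second never sees it, so sample clocks driven by these two series would be steered differently.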