1. Field of the Invention
This invention relates to audio routing systems and networks, and in particular to a system and method for transferring data over audio and video networks in real time using highly accurate and stable clocks to control the transfer, digitization, and playback processes.
2. Description of Related Art
Many systems are available to provide high quality digital and audio data transfer from one device to another over a digital network. Transferring audio data in real-time requires that the source and destination generate and consume the data at identical rates in order to avoid accumulating data or running out of data at the destination. This requires some form of clock synchronization.
Clock synchronization deals with understanding the temporal ordering of events produced by concurrent processes. It is useful for synchronizing senders and receivers of messages, controlling joint activity, and serializing concurrent access to shared objects. The goal is that multiple unrelated processes running on different machines should agree on, and be able to make consistent decisions about, the ordering of events in a system. One aspect of clock synchronization deals with synchronizing time-of-day clocks among groups of machines. In this case, the goal is to ensure that all machines can report the same time, regardless of how imprecise their clocks may be or what the network latencies are between the machines.
Most computers today keep track of the passage of time with a battery-backed CMOS clock circuit driven by a quartz resonator. This allows timekeeping to continue even when the machine is powered off. When the machine is on, an operating system will generally program a timer circuit (for example, a Programmable Interval Timer, or PIT, in older Intel architectures and an Advanced Programmable Interrupt Controller, or APIC, in newer systems) to generate an interrupt periodically (common rates are 60 or 100 times per second). The interrupt service procedure simply adds one to a counter in memory. While the best quartz resonators can achieve an accuracy of one second in 10 years, they are sensitive to changes in temperature and acceleration, and their resonating frequency may change as they age. The problem with maintaining a concept of time occurs when multiple entities expect each other to have the same idea of what the time is. Two watches hardly ever agree. Computers have the same problem: a quartz crystal on one computer will oscillate at a slightly different frequency than a crystal on another computer, causing the clocks to “tick” at different rates.
In systems where the devices are located near each other, typically within a few meters, sharing a common timing signal is generally the easiest and most accurate method of synchronization. To accurately use a common timing signal, a device must be calibrated to account for the signal propagation delay from the timing source to the device. Sharing a common timing signal becomes unfeasible when the distance between devices increases or when devices frequently change location. Even at moderate distances, e.g., 50 meters, a common timing signal may require significant costs for cabling and configuration. Additionally, even the smallest errors in keeping time can add up significantly over a long period. If a clock is off by just 10 parts per million, it will gain or lose almost a second a day. Thus, transmission distances will add complexity and error to the system. In general, the larger the number of hops between a computer and the original time source, the larger the error in synchronization will be.
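By way of illustration only, the drift figure quoted above can be checked with a short calculation. The following Python sketch assumes a constant rate error expressed in parts per million; the function name drift_seconds is hypothetical and chosen for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def drift_seconds(ppm_error: float, elapsed_seconds: float) -> float:
    """Accumulated time error of a clock whose rate is off by a
    constant ppm_error parts per million over elapsed_seconds."""
    return ppm_error * 1e-6 * elapsed_seconds


# A 10 ppm error accumulates to roughly 0.864 s per day,
# i.e. "almost a second a day".
per_day = drift_seconds(10, SECONDS_PER_DAY)
per_year = drift_seconds(10, 365 * SECONDS_PER_DAY)
print(f"{per_day:.3f} s/day, {per_year:.1f} s/year")
```

The same error, left uncorrected, grows to several minutes over a year, which is why free-running clocks cannot be relied upon to stay aligned.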
Distributed clock synchronization attempts to mitigate the deficiencies of common timing signal synchronization. Using this approach, devices act on timing signals originating from a local clock which is synchronized to the other clocks in the system. Examples of distributed clock synchronization include devices synchronized to a GPS satellite, a PC's internal clock synchronized to an NTP time server, or a group of devices participating in the IEEE 1588 protocol. Instead of sharing timing signals directly, these devices periodically exchange information and adjust their local timing sources to match each other. GPS satellites (and now other global navigation systems) generally include three or four atomic clocks, far from the source and destination locations, that are monitored and controlled to be highly synchronized and traceable to national and international standards. Thus, for time synchronization, the GPS signal is received, processed by a local master clock, time server, or primary reference, and passed on to “slaves” and other devices, systems, or networks so their “local clocks” are likewise synchronized. When time information is passed on to “slaves,” the process is referred to as time stamping, and each packet of time data is referred to as a timestamp.
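As a non-limiting illustration of how devices may exchange information to adjust their local clocks, the following Python sketch computes the clock offset and round-trip delay from the four timestamps of a two-way message exchange, in the manner commonly used by NTP-style protocols. It assumes the network delay is the same in both directions; the function name and the example timestamps are hypothetical.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one exchange.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: reply sent        (server clock)
    t4: reply received    (client clock)

    Assumes symmetric path delay; any asymmetry appears as offset error.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)            # time spent on the network
    return offset, delay


# Hypothetical timestamps in milliseconds: the server clock runs 500 ms
# ahead, each direction takes 10 ms, and the server takes 2 ms to reply.
offset, delay = offset_and_delay(1000, 1510, 1512, 1022)
print(offset, delay)
```

The client would then steer its local clock by the estimated offset. The estimate is only as good as the symmetry assumption, which is one reason packet-based synchronization retains some residual error.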
Many digital audio systems are built on proprietary networks that provide clock signals along with the data, allowing the destination device to slave its clock to the source device. Some open protocols, such as the Audio Engineering Society and European Broadcasting Union protocol (AES/EBU), also allow this type of synchronization by delivering the clock in the same stream as the data (self-clocking data streams).
Placing timestamps on transmitted frames can preserve packet timing relationships between the source device (transmitter) and the sink device (receiver), and thereby minimize the effects of latency and jitter over the wireless network. Latency is synonymous with delay and refers to the amount of time it takes a bit to be transmitted from source to destination. Jitter is delay that varies over time. One way to view latency is how long a system holds on to a packet. Delays are caused by distance, errors and error recovery, congestion, the processing capabilities of systems involved in the transmission, and other factors. Even if hardware-type delays are removed, the system would still have the speed-of-light delay. It takes nearly 30 ms to send a bit through a cross-country fiber-optic cable, a delay that cannot be eliminated. Delays of distance (called propagation delays) are especially critical when transmitting data to other countries (especially when considering all the equipment along the way that adds delay). Delay is also significant with satellite transmissions.
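The speed-of-light delay mentioned above can be approximated, for illustration only, as the route length divided by the speed of light in the fiber. The Python sketch below assumes a refractive index of about 1.47 for the fiber core and a hypothetical 6,000 km cable route (real routes are longer than the straight-line distance); both values are assumptions made for this example.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # in vacuum
FIBER_REFRACTIVE_INDEX = 1.47          # assumed typical value for silica fiber


def propagation_delay_s(route_length_m: float,
                        refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """One-way propagation delay through a fiber of the given length."""
    speed_in_fiber = SPEED_OF_LIGHT_M_PER_S / refractive_index
    return route_length_m / speed_in_fiber


# Hypothetical 6,000 km cross-country route: a little under 30 ms one way.
delay_ms = propagation_delay_s(6_000_000) * 1000
print(f"{delay_ms:.1f} ms")
```

This component of latency is fixed by distance and cannot be removed by faster equipment; only the variable components contribute to jitter.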
When a frame is received at the receiver, the receiver can retrieve a timestamp from the frame and release the frame to the application once the local clock reading reaches the value in the timestamp. Digital or analog audio/video streams or video files usually contain some sort of explicit AV-sync timing, either in the form of interleaved video and audio data or by explicit relative time stamping of data. The processing of data must respect the relative data timing, for example, by stretching or interpolating received data. If the processing does not respect the relative timing, the AV-sync error will increase whenever data gets lost because of transmission errors or because of missing or mis-timed processing.
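The timestamp-based release described above may be sketched, purely as an example, with a priority queue ordered by timestamp. The class name PlayoutQueue and its methods are hypothetical, and the sketch assumes that the timestamps and the local clock are expressed in the same units.

```python
import heapq


class PlayoutQueue:
    """Holds received frames until the local clock reaches their timestamp."""

    def __init__(self):
        self._heap = []   # entries are (timestamp, arrival_order, payload)
        self._order = 0   # tie-breaker so payloads are never compared

    def push(self, timestamp, payload):
        """Store a frame that arrived from the network, in any order."""
        heapq.heappush(self._heap, (timestamp, self._order, payload))
        self._order += 1

    def release_due(self, local_clock):
        """Return, in timestamp order, every frame whose time has come."""
        due = []
        while self._heap and self._heap[0][0] <= local_clock:
            timestamp, _, payload = heapq.heappop(self._heap)
            due.append((timestamp, payload))
        return due


queue = PlayoutQueue()
for ts, frame in [(30, "c"), (10, "a"), (20, "b")]:   # out-of-order arrival
    queue.push(ts, frame)

print(queue.release_due(5))     # nothing is due yet
print(queue.release_due(20))    # frames stamped 10 and 20
print(queue.release_due(100))   # the remaining frame
```

Such a scheme preserves the relative timing of frames only if the receiver's local clock advances at the same rate as the clock that generated the timestamps.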
Advantages of timestamp-based syncing include ease of implementation and the use of a single property for syncing. The disadvantages stem from the fact that time is relative to the observer, and different machine clocks can be out of sync. Generally, the prior art has employed one of two methods to address this: a) generating a timestamp on a single machine, which does not scale well and represents a single point of failure; or b) using logical clocks such as vector clocks, which are very difficult to implement. Time stamping enables correlation between multiple trace streams, and is provided by timestamp packets.
Timestamp-based syncing works for client-to-master syncing but does not work as well for peer-to-peer syncing or where syncing can occur with two masters. This method is vulnerable to a single point of failure, namely whatever generates the timestamp. Furthermore, in timestamp-based syncing, time is not really related to the content of what is being synced.
On systems where clock information cannot be directly transferred, the clocking information can often be inferred from the arrival time of the data but not very accurately. This is especially true for systems using Ethernet or similar data networks, where traffic from other sources can potentially interfere with accurate packet timing. Even in cases where other traffic is not a problem, there is usually enough uncertainty in the transmission and arrival time of packets to cause the derived clock on the receive side to suffer from some jitter.
The prior art has attempted to solve the time synchronization problems through the introduction of a single atomic clock, which unfortunately has led to the problems associated with a GPS time synchronization system.
For example, U.S. Pat. No. 7,015,848, issued to Ohashi, et al., on Mar. 21, 2006, teaches the use of an atomic clock to produce a high quality audio signal. The patent indicates that audio quality is directly related to the accuracy of the clock. A single clock is used for both recording and reproduction of the audio signal, and the patent refers to the possibility of a transmission channel between the two and to the improvement obtained by using an accurate clock to eliminate signal degradation due to timing errors and delays in the transmission channel. However, the patent does not disclose the use of multiple clocks (or more accurately, multiple highly accurate clocks, such as atomic clocks) at the transmitter and receiver that would run simultaneously to reduce buffer size or to eliminate the requirements of clock resynchronization, data rate control, buffer management, or any combination thereof.
In U.S. Publication No. 2011/0299641 to Barkan, et al., on Dec. 8, 2011, titled “Synchronous Network Device,” a system with multiple ports is described in which each port uses time data from a “grandmaster clock.” The grandmaster clock data is cleaned up by removing jitter and voltage swings and by smoothing leading-edge variations. The grandmaster clock may be an atomic clock; however, only a single atomic clock is used, and a communication channel between the grandmaster clock and each network port is required.
In U.S. Publication No. 2011/0274192 to Wei, et al., on Nov. 20, 2011, titled “Synchronization Method and Device for Real-Time Distributed System,” a real-time wireless communication system is described. A single GPS-derived atomic clock signal is used as the time signal. The clock signal is used by multiple digital signal processors, which rely on this single clock signal to decide whether to operate synchronously or asynchronously.
When a single source is sending data to multiple destinations, the destination clocks must be adjusted to speed up or slow down the rate at which they use the data. This is because the source clock cannot be adjusted to match the multiple destinations which have varying clock rates. Complex clock synchronizing systems are used in these applications to control the data rate at each of the destinations.
Similarly, where a single destination is receiving data from multiple sources, the multiple sources must be kept synchronized to prevent overflow or underflow in the destination buffer. Prior art designs for both the single destination and/or single source applications need a back channel communication system for clock resynchronization. The present invention avoids the requirement of a back channel for communicating clock synchronization data and all of the resynchronization circuitry. Further, the distortions of the audio and video signal frequencies that are caused by clock resynchronization and data rate control are entirely eliminated.
In the prior art, to ensure that synchronization occurs and that data is never missing when it is needed, arriving data would be temporarily stored in a buffer and used only once enough data had been stored. Consider a single source and multiple destinations. When the single source starts sending data to each destination, the buffer at each destination holds the digital packets as it receives them, without using them, until sometime later. When the digital packets are used faster than the rate at which they are sent, the buffer will run out of stored data because the destination clock is not synchronized with the sending clock. The data packets are being sent at the rate of the sending clock (that is, at a given speed), while the destination clock is consuming data at a slightly faster rate. Thus, the destination uses the data packets faster than the buffer can fill (an underflow condition), and eventually all the data packets are used up. Alternatively, if the destination is not using the data packets fast enough, the buffer will overflow; that is, there is not enough storage space, yet the system must continue to put data in as long as data is being received. If the system is not using data at exactly the same rate as the incoming data rate, the data packets build up and eventually overflow the buffer. Buffer underflow and overflow conditions are indicative of a common problem in synchronization. One structural solution is to employ very large buffers; however, that requires predetermined knowledge of the amount of usage expected in a given transmission period. Accordingly, prior art low latency real-time designs for audio and video streaming require some form of control over either the sending data rate at the source or the rate at which data is used at the destination.
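The underflow and overflow conditions described above can be quantified with a simple model, given here as an illustrative sketch only. It assumes a constant rate mismatch between the source and destination clocks, a buffer that starts half full, and hypothetical figures (a 48 kHz stream and a 4,800-sample buffer); the function name is likewise hypothetical.

```python
def time_to_buffer_fault(depth_samples, sample_rate_hz, dest_ppm_offset,
                         start_fill=None):
    """Seconds until the destination buffer underflows or overflows.

    dest_ppm_offset > 0 means the destination clock runs fast (it consumes
    data faster than the source sends it); < 0 means it runs slow.
    Returns (seconds, "underflow" | "overflow"), or (None, "stable").
    """
    if start_fill is None:
        start_fill = depth_samples / 2              # assume a half-full start
    # Net samples gained by the buffer per second of source time.
    net_rate = -sample_rate_hz * dest_ppm_offset * 1e-6
    if net_rate == 0:
        return None, "stable"
    if net_rate < 0:                                # buffer is draining
        return start_fill / -net_rate, "underflow"
    return (depth_samples - start_fill) / net_rate, "overflow"


# Destination 50 ppm fast: the half-full buffer drains after about 1,000 s.
print(time_to_buffer_fault(4800, 48_000, 50))
# Destination 50 ppm slow: the buffer fills up after about 1,000 s.
print(time_to_buffer_fault(4800, 48_000, -50))
```

Under this model, doubling the buffer depth only doubles the time before a fault occurs, while also doubling the delay through the buffer, which is why buffering alone does not remove the need for rate control.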
Generally, large buffers tolerate system clock drift while the local clock is adjusted to match the clock of the incoming data (reference) packets. In typical systems, the synchronization is achieved by having the clock control circuitry or software attempt to maintain the buffers at half-full status, so the data is normally delayed by a time equivalent to half the total buffer depth.
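As an illustrative sketch only, the latency cost of the half-full strategy and one possible form of the corresponding clock adjustment can be expressed as follows. The proportional correction and its gain of 100 ppm are assumptions made for this example and are not taken from any particular prior art system; the function names are hypothetical.

```python
def half_full_latency_ms(depth_samples, sample_rate_hz):
    """Delay introduced by holding the buffer at half of its depth."""
    return (depth_samples / 2) / sample_rate_hz * 1000.0


def rate_correction_ppm(fill_samples, depth_samples, gain_ppm=100.0):
    """Proportional clock adjustment that steers the buffer toward half full.

    Positive values mean the local clock should speed up (buffer too full);
    negative values mean it should slow down (buffer too empty).
    gain_ppm is a hypothetical tuning constant.
    """
    half = depth_samples / 2
    error = (fill_samples - half) / half    # ranges from -1.0 to +1.0
    return gain_ppm * error


# A 4,800-sample buffer at 48 kHz held half full adds about 50 ms of delay.
print(half_full_latency_ms(4800, 48_000))
print(rate_correction_ppm(3600, 4800))      # buffer above half: speed up
```

The sketch makes the trade-off explicit: the deeper the buffer used to absorb drift, the larger the fixed delay imposed on the data, and every correction of the local clock alters the rate at which the audio or video is reproduced.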
The present invention provides a method and apparatus to achieve very low latency in the data stream, while avoiding the necessity of having a direct physical clock connection to synchronize the clocks in the system.