With the advent of home networking and of Internet- and PC-based media, devices which play digital media content have become common. Transmission between an audio source and a playback device can be analog or digital. In such systems, synchronized playback of different devices at varied locations relies on a continuous stream of digital content, or an analog signal, being received simultaneously by multiple devices. For example, a whole-house audio system can have an amplifier, located in any convenient place, which simultaneously distributes analog signals to multiple speakers throughout the house. In an office environment, a telephone system may be used for paging. In such a system, the central telephone switch activates all handsets and sends an identical audio signal to them. Analog audiovisual signals transmitted to remote speakers and/or view screens from a single source typically are perceived as simultaneous or synchronized. The signals propagate at near-light speed whether traveling on a transmission line or in free space, so that, over relatively short distances, all signals appear to arrive at the same time. However, this may not be true of digital signals transmitted as packets over a network.
Due to the cost of home networking equipment, and the proliferation of computer-based digital media, it has become common to locate standard Internet protocol (IP) based network devices throughout the home for playing a variety of digital media. However, due to the structure of such IP networks, the delivery time of information (e.g., a digital audio signal) from one device to another can vary over time. Thus, even a simultaneous broadcast from a single location may not reach the intended location, or locations, within a fixed or known period of time. The nature of switching and routing on an IP or other packet network is such that a data packet may be delayed in reaching a given destination, particularly in relation to the arrival time of other packets sent at almost the same time but to other devices. For example, some packets may be delayed because of network traffic, or because packets travel different network routes to the same location. Even a promiscuous-mode multicast may reach multiple destinations at different times because of delays introduced by packet switching and by traffic on the network. Thus, even over short distances, playback may not be synchronized. Unlike analog signals, where a perceived delay is often primarily a function of physical distance, digital signals are subject to other factors that may affect their transmission and perceived delay.
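The variable delivery time described above can be illustrated with a short simulation. The following Python sketch is only an illustration under stated assumptions: the delay model (a fixed propagation delay plus a uniformly distributed queuing delay), the function names simulate_arrivals and arrival_skew, and all numeric values are hypothetical and are not taken from any particular network or from the references discussed herein.

```python
import random


def simulate_arrivals(num_packets, send_interval_ms, base_delay_ms, jitter_ms, rng):
    """Arrival times (ms) at one receiver for packets sent at a fixed interval."""
    arrivals = []
    for i in range(num_packets):
        send_time = i * send_interval_ms
        # Assumed delay model: fixed propagation delay plus a random queuing delay.
        delay = base_delay_ms + rng.uniform(0.0, jitter_ms)
        arrivals.append(send_time + delay)
    return arrivals


def arrival_skew(num_packets=5, send_interval_ms=20.0, base_delay_ms=2.0,
                 jitter_ms=15.0, seed=1):
    """Per-packet difference in arrival time (ms) between two receivers."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    room_a = simulate_arrivals(num_packets, send_interval_ms, base_delay_ms, jitter_ms, rng)
    room_b = simulate_arrivals(num_packets, send_interval_ms, base_delay_ms, jitter_ms, rng)
    return [abs(a - b) for a, b in zip(room_a, room_b)]


if __name__ == "__main__":
    for i, skew in enumerate(arrival_skew()):
        print(f"packet {i}: receivers differ by {skew:.1f} ms")
```

In this model the same packet, sent at the same instant, can reach the two receivers up to jitter_ms apart, whereas setting jitter_ms to zero reproduces the analog case in which every receiver sees the signal at effectively the same time.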
For example, a user walking from one room to another, where an audiovisual program is being sent to and rendered by different playback devices at different network nodes, may experience undesired, overlapping, or out-of-sync renditions of the same broadcast or programming. Conventional technology may use a cache or buffer to address gaps or lags in a single rendition of a program, but this does not solve the problem of synchronizing two or more renditions of a program at different network nodes, and indeed may compound this problem. To accommodate the playback of digital media content without gaps or lags, playback devices typically implement a “buffering” mechanism in which the playback device maintains sufficient data to account for any “lag” or delay in getting additional data. Uninterrupted playback is thus obtainable, even though individual parts of the data stream may arrive in an erratic manner.
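The buffering mechanism described above, and the reason it does not by itself synchronize separate devices, can be sketched as follows. This Python sketch is a simplified illustration only: the function names playout_schedule and count_underruns, the fixed pre-buffer delay, and the arrival times are assumptions chosen for the example and do not describe any particular playback device.

```python
def playout_schedule(arrivals_ms, interval_ms, prebuffer_ms):
    """Times (ms) at which each packet is due to be rendered.

    Playback starts a fixed pre-buffer delay after the first packet arrives
    and then consumes one packet per interval.
    """
    start = arrivals_ms[0] + prebuffer_ms
    return [start + i * interval_ms for i in range(len(arrivals_ms))]


def count_underruns(arrivals_ms, interval_ms, prebuffer_ms):
    """Number of packets that arrive after the time they were due to be rendered."""
    due = playout_schedule(arrivals_ms, interval_ms, prebuffer_ms)
    return sum(1 for arrived, needed in zip(arrivals_ms, due) if arrived > needed)


if __name__ == "__main__":
    # Hypothetical arrival times (ms) of the same five packets at two devices.
    device_a = [5.0, 38.0, 44.0, 71.0, 85.0]
    device_b = [12.0, 30.0, 52.0, 66.0, 90.0]

    print("device A gaps without buffering:", count_underruns(device_a, 20.0, 0.0))
    print("device A gaps with 30 ms buffer:", count_underruns(device_a, 20.0, 30.0))

    # Each device anchors its schedule to its own first arrival, so the two
    # renditions start at different times even though neither has gaps.
    offset = playout_schedule(device_b, 20.0, 30.0)[0] - playout_schedule(device_a, 20.0, 30.0)[0]
    print("start offset between devices (ms):", offset)
```

With the pre-buffer in place, each device in this example plays without gaps, but because each device times its playback from its own first arrival, the two renditions remain offset from one another.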
Another exemplary method for providing guaranteed timely delivery of network packets involves modifying the hardware infrastructure of the network. Draft standard IEEE 802.11e proposes such an implementation, disclosing a modified network protocol for streaming delivery of digital media which assures that data will arrive in sequence and within a given time frame. However, implementing this proposed protocol requires additional hardware and networking software upgrades to a system. IEEE 802.11e is hereby incorporated by reference in its entirety.
Audiovisual (A/V) or multimedia systems typically comprise a content source, one or more signals produced from that source, one or more paths or channels for the audio and/or video signals, and devices for rendering the signals, such as a display system and/or a speaker system. In today's multimedia systems, there may be more than one visual display, and typically there are multiple speakers for audio playback. For coordinated audiovisual playback in one location, audio and video are synchronized with each other by known means. For example, a digital audiovisual source, such as content stored on a CD or DVD, or on a hard drive, is accessed and the digital media is converted to an analog signal. The analog signal is delivered simultaneously to the display and to all of the connected speakers. Because these analog signals travel at near-light speeds over very short distances, e.g., through wires or as a wireless analog broadcast, the playback at the display and the speakers is perceived as simultaneous or “in sync.” Additional analog displays and speakers can be added, for example in other rooms or locations, so long as additional wiring or wireless resources are available. Sight and sound are synchronized, e.g., between the displays and the speakers, by known means which account for differences in audio and visual rendering and perception. A/V synchronization in these systems does not depend on coordinating the speed or delivery time of the analog signal, which remains essentially the same throughout the system.
Synchronized playback and performance or rendition of audiovisual content cannot be provided in this way over a digital, broadband or packet-switched network. Difficulties in achieving synchronized playback among multiple playback devices on a heterogeneous network, which comprises devices with differing speeds and memory capacities, cause observable audio and video differences to exist between the outputs provided by the multiple playback devices. The nature of IP network packet routing and timing, and conventional buffering mechanisms, may add to the problem of providing simultaneous playback from two or more playback devices. What is needed in the art is an approach to synchronization which works within the framework of readily available network hardware and protocols, without requiring changes to the network hardware or network software. Further, missing from the art is a synchronization capability which allows multiple disparate networked media playback devices to play back both streaming real-time media and pre-recorded media. The present invention can satisfy one or more of these needs.
The prior art has addressed network timing issues, but does not confront or provide solutions to the problem of synchronized playback at different nodes of a digital network.
U.S. Pat. No. 6,611,537 to Edens et al., issued Aug. 26, 2003 and titled “Synchronous Network For Digital Media Streams,” discloses a synchronous logical ring network which operates on existing twisted-pair telephone topologies. The network comprises consumer electronics devices including a DSS tuner which receives MPEG2 movie format, a multimedia PC, CD/DVD players, and broadcast FM receivers. Information propagates around the logical ring, reaching every device on each revolution around the ring. The patent discloses network devices that are full-duplex devices, i.e., transmitting and receiving information on every clock cycle. The network devices arbitrate to be elected the network data clock device, which is not a timekeeping device, but merely provides a regularly pulsed reference signal that synchronizes the reading of a data stream and is present in conjunction with the data stream. By synchronizing all network devices to the reference clock provided by a single device, and providing fixed frames of information at consistent time intervals, the Edens et al. patent discloses that information propagates at consistent intervals. Synchronization is achieved by the network clock device sending a synchronization marker at constant intervals to which all other devices synchronize. The fixed-length frames are divided into two independent streams: a data stream for distribution of real-time continuous digital media streams, and a system command stream.
U.S. Pat. No. 6,622,171 to Gupta et al., issued Sep. 16, 2003, is titled “Multimedia Timeline Modification In Networked Client/Server Systems.” This disclosure is directed to streaming multimedia content over a network system from a server computer to a client computer. The client allows a user to vary the speed at which the multimedia content is rendered at the client. Time-scale modification technology is used to maintain the original pitch of any audio content, thereby maintaining its intelligibility. The speed of the time-scale modification selected by the user alters the playback speed of the streaming multimedia content. While this improves the fidelity of playback on a single device operating at different speeds, it does not synchronize the playback of two or more devices.
U.S. Pat. No. 6,598,172 to VanDeusen et al., issued Jul. 22, 2003, is titled “System and Method for Clock Skew Compensation Between Encoder and Decoder Clocks by Calculating Drift Metric, and Using it to Modify Time-Stamps of Data Packets.” This patent discloses a computer system for encoding and decoding a series of data packets containing audio and/or video data. The patent discloses a method and a device for compensating for clock skew, which can exist between the clock used by the encoder and the clock used by the decoder. The patent discloses that the decoder calculates a drift metric representing the clock skew and modifies the time-stamps of the data packets based on the drift metric. The number of audio and/or video packets waiting in a playback buffer is used to calculate the drift metric based on the growth or diminution of the queue. The decoder also adjusts the sample rate conversion to adjust the playback rate of the digital data. Each packet header includes time-stamp data representing presentation time. The time-stamp for both audio and video packets is modified using the drift metric in order to compensate for any differential in clock drift or clock skew between the encoding and decoding clocks. This modification maintains synchronization between the audio and video playback, i.e., lip sync.
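The general idea of deriving a drift estimate from the growth or diminution of a playback queue can be sketched as follows. This Python sketch is not the method of the VanDeusen et al. patent; it is a simplified illustration under assumptions made here, namely that the queue depth is sampled at a fixed period, that the drift is taken as the least-squares slope of the queue depth divided by the nominal packet rate, and that time-stamps are rescaled by that fraction. The names drift_metric and adjust_timestamp are hypothetical.

```python
def drift_metric(queue_depths, sample_period_s, packets_per_s):
    """Fractional rate mismatch estimated from the trend of the playback queue.

    A positive value means the queue is growing, i.e., packets arrive faster
    than the decoder consumes them.
    """
    n = len(queue_depths)
    if n < 2:
        raise ValueError("need at least two queue-depth samples")
    times = [i * sample_period_s for i in range(n)]
    mean_t = sum(times) / n
    mean_q = sum(queue_depths) / n
    # Least-squares slope of queue depth versus time, in packets per second.
    num = sum((t - mean_t) * (q - mean_q) for t, q in zip(times, queue_depths))
    den = sum((t - mean_t) ** 2 for t in times)
    return (num / den) / packets_per_s


def adjust_timestamp(timestamp_ms, drift):
    """Rescale a presentation time-stamp so that playback tracks the sender's rate."""
    return timestamp_ms / (1.0 + drift)


if __name__ == "__main__":
    # Hypothetical samples: the queue gains one packet per second at 50 packets/s.
    drift = drift_metric([10, 11, 12, 13, 14], sample_period_s=1.0, packets_per_s=50.0)
    print("estimated drift:", drift)
    print("adjusted time-stamp (ms):", adjust_timestamp(1020.0, drift))
```

A correction of this kind keeps the audio and video of a single decoder aligned with each other and with the encoder; it does not, by itself, align the outputs of two separate playback devices.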
All references and patents cited herein are hereby incorporated by reference, as if these references and patents were set forth in their entirety within this application.
Though dealing with network playback and timing issues, the prior art does not provide a system for rendition of a program by multiple playback devices in a manner that is perceived by the user to be synchronized or contemporaneous.