1. Field of the Invention
Embodiments of the invention relate generally to the implementation of a packet recovery mechanism for the robust transport of live and real-time media streams over packet-switched networks. Such media streams may consist of an audio and a video component or any combination of audio or video or other time-sensitive signals. The packet-switched network may include Internet connections and IP networks in general. More specifically, such embodiments relate to Automatic Repeat reQuest (ARQ) mechanisms optimized for robust, low-latency, and bandwidth-efficient transport of audio and video streams over packet-switched networks.
2. Description of the Related Art
Random congestion in packet-switched networks, such as the Internet, adds an unpredictable amount of jitter and packet loss to the transport of video and audio packet streams. Furthermore, the most efficient form of video compression, variable bit-rate (VBR) coding, produces large bursts of data that add to network congestion, compounding potential router queue overflow and the resulting packet loss. Thus, both the number of packets that a network might drop and the instantaneous packet rate may fluctuate greatly from one moment to the next.
In addition to contending with packet delivery problems, maintaining low latency is a critical constraint for videoconferencing and other applications that involve interaction between the viewer and the subject. Some examples of applications where low latency is critical are: security, where an operator may desire to control the pan/tilt/zoom of a remote camera to follow activity; and videoconferencing, to enable more fluid and natural conversations.
Automatic Repeat reQuest (ARQ) provides a resilient and adaptable method for correcting packet loss in IP networks, especially as compared with forward error correction (FEC). ARQ is an integral quality of service (QoS) component of the ISO High-level Data Link Control (HDLC) communications standard [1] [2]. ARQ detects missing packets at a receiver and asks the transmitter to resend them. Various forms of ARQ have been applied to data packet transmission to help minimize the adverse impact of channel impairments on packetized data. Advantages of ARQ over other error correction mechanisms include its adaptability and resilience under random and dynamically varying channel conditions.
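By way of illustration only, the receiver-side detection step of ARQ described above can be sketched as follows. The sketch assumes that each packet carries a monotonically increasing sequence number and ignores sequence-number wraparound; the function name find_missing is hypothetical and is not taken from any cited standard.

```python
def find_missing(received_seqs, first_seq, last_seq):
    """Return the sequence numbers in [first_seq, last_seq] that never arrived.

    Assumption: packets are numbered consecutively and the counter does not
    wrap around within the window being checked.
    """
    seen = set(received_seqs)
    return [s for s in range(first_seq, last_seq + 1) if s not in seen]


# Packets 3 and 6 were dropped in transit; the receiver would ask the
# transmitter to resend exactly these sequence numbers.
arrived = [1, 2, 4, 5, 7, 8]
retransmit_request = find_missing(arrived, 1, 8)
print(retransmit_request)
```

In such a scheme the receiver would send the resulting list back to the transmitter as its retransmission request.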
The most commonly used protocol for robust packet transmission in IP networks is the Transmission Control Protocol (TCP) as described in RFC793 [3]. The United States' Advanced Research Projects Agency (ARPA) first implemented TCP in the ARPANET network, the precursor to the Internet, as a mechanism for improving the reliability of packetized data transmission over otherwise unreliable network connections. TCP implements a form of positive-acknowledgement continuous ARQ, since it requires a return packet acknowledging the receipt of packets over a time window of transmitted packets. The main design goal for TCP was to provide robust transmission of data over unreliable links and in the presence of network congestion. TCP introduces variable latency and has a mechanism for throttling back transmission rates as congestion increases.
However, conventional ARQ and TCP protocols do not address the transmission requirements for real-time multi-media signals, where a packet's late arrival is equivalent to dropping that packet altogether. For real-time audio and video, packets must be rendered as a sequential isochronous data stream. Consequently, all packets must arrive before the signal is rendered and output to the user. In particular, after a video or audio segment has played out, the late arrival of an earlier missing packet can no longer be used in the signal presentation. Robust transmission for real-time multi-media streams therefore requires that, in addition to recovering any lost packets, packets must meet hard latency deadlines and follow strict sequence ordering.
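As a minimal sketch of the deadline constraint described above, the following fragment treats a packet as usable only if it arrives no later than its playout time. The names is_usable and retransmission_can_help, and the use of a single round-trip-time estimate, are assumptions made for illustration and are not part of any cited protocol.

```python
def is_usable(arrival_ms, playout_deadline_ms):
    """A packet is useful only if it arrives by the time it must be rendered."""
    return arrival_ms <= playout_deadline_ms


def retransmission_can_help(now_ms, rtt_ms, playout_deadline_ms):
    """Estimate whether a resend requested now could arrive before playout.

    Assumption: the resent packet arrives one round-trip time after the
    request is issued.
    """
    return is_usable(now_ms + rtt_ms, playout_deadline_ms)


# A packet due for playout at t = 200 ms:
print(retransmission_can_help(now_ms=100, rtt_ms=80, playout_deadline_ms=200))
print(retransmission_can_help(now_ms=150, rtt_ms=80, playout_deadline_ms=200))
```

In the first call the resend would arrive at 180 ms and could still be rendered; in the second it would arrive at 230 ms, which for a real-time stream is equivalent to the packet having been dropped.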
None of the aforementioned art discusses ARQ techniques that limit latency, and none addresses robust transport for VBR streams, where the receiver may have to wait for a variable number of packets before it can request retransmission of missing packets to restore a stream. The aforementioned art also does not address such issues when the media stream includes audio packets, which must maintain a precise timing relationship with the associated video packets to preserve lip-sync. There is no known published work disclosing retransmission mechanisms that have been designed to preserve live media streams or that can provide assurances that recovered media packets arrive in time and in the correct order to be properly rendered.
Forward Error Correction (FEC) provides an alternative to ARQ for the recovery of lost and corrupted packets. The Pro-MPEG Forum (www.pro-mpeg.org), an association of broadcast industry companies and professionals, has agreed upon an FEC standard for video over IP networks [5]. Pro-MPEG FEC is based in large part upon IETF RFC2733. It interleaves data packets into a two-dimensional array and generates a parity packet across the packets in each row and each column to provide forward error correction. The single parity packet of a row or column can recover only a single lost packet in that row or column. However, interleaving data into a row and column array gives Pro-MPEG FEC the ability to protect against the contiguous loss of a short sequence of packets within a media stream. This burst-drop protection is the most significant characteristic of Pro-MPEG FEC.
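The burst-drop protection described above can be illustrated with a simplified sketch of column parity. The sketch assumes equal-length packets and plain bytewise XOR parity over payloads only; it does not reproduce the RFC2733 packet format or its header protection, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def column_parities(packets, num_cols):
    """One parity packet per column; column c holds packets c, c+num_cols, ..."""
    return [reduce(xor_bytes, packets[c::num_cols]) for c in range(num_cols)]


def recover_columns(received, parities, num_cols):
    """Restore at most one lost packet (marked None) in each column."""
    repaired = list(received)
    for c in range(num_cols):
        idxs = range(c, len(received), num_cols)
        lost = [i for i in idxs if received[i] is None]
        if len(lost) == 1:
            survivors = [received[i] for i in idxs if received[i] is not None]
            # XOR of the parity with all survivors yields the missing packet.
            repaired[lost[0]] = reduce(xor_bytes, survivors, parities[c])
    return repaired


# Twelve 4-byte packets arranged as 3 rows by 4 columns.
packets = [bytes([i]) * 4 for i in range(12)]
parities = column_parities(packets, num_cols=4)

# A burst drops four consecutive packets (indices 5 to 8).
received = list(packets)
for i in range(5, 9):
    received[i] = None

restored = recover_columns(received, parities, num_cols=4)
print(restored == packets)
```

Because consecutive packets fall into different columns, a burst no longer than the number of columns removes at most one packet from each column, and each column's single parity packet suffices to restore it. A second loss in the same column would not be recoverable from that column's parity alone.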
However, this protection comes at the cost of additional throughput overhead and significant added latency. For example, in order to protect against 100 milliseconds of contiguous packet loss, such as may occur during a dynamic rerouting or a switchover of routes when a router fails, and assuming FEC with 20% throughput overhead, Pro-MPEG FEC introduces 500 milliseconds of latency.
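The figures in the foregoing example are consistent with the following back-of-the-envelope calculation, which is offered as one plausible reading rather than as the derivation used in the cited standard. It assumes column-only parity, so that the overhead is one parity packet per column of D data packets (overhead = 1/D), and assumes that the receiver must buffer the full array, which spans D times the protected burst duration. The function name is hypothetical.

```python
def fec_block_latency_ms(burst_ms, overhead):
    """Estimate the buffering latency of a column-parity FEC array.

    Assumptions (not stated in the text): one parity packet per column of
    `rows` data packets, so overhead = 1 / rows, and the array spans
    `rows` times the burst duration being protected.
    """
    rows = round(1 / overhead)
    return burst_ms * rows


# 100 ms of burst protection at 20% overhead.
print(fec_block_latency_ms(100, 0.20))
```

Under these assumptions, 20% overhead corresponds to five data packets per parity packet, and five times the 100 millisecond burst yields the 500 millisecond figure.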