1. Field of the Invention
The present invention relates generally to video communication, and more particularly to video error concealment.
2. Description of Related Art
Video images have become an increasingly important part of global communication. In particular, video conferencing and video telephony have a wide range of applications such as desktop and room-based conferencing, video over the Internet and over telephone lines, surveillance and monitoring, telemedicine, and computer-based training and education. In each of these applications, video and accompanying audio information is transmitted across telecommunication links, including telephone lines, ISDN, DSL, and radio frequencies.
A standard video format used in video conferencing is Common Intermediate Format (CIF), which is part of the International Telecommunication Union (ITU) H.261 videoconferencing standard. The primary CIF format is also known as Full CIF or FCIF. Additional formats with resolutions higher and lower than FCIF have also been established. FIG. 1 is a prior art table of the resolution and bit rate requirements for various video formats, under the assumption that 12 bits are required to represent one pixel. The bit rates shown (in megabits per second, Mbps) are for uncompressed color frames.
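The relationship between resolution and uncompressed bit rate described above can be illustrated with a short calculation. The following is a minimal sketch only: the frame rate of 30 frames per second is an assumption not stated above, the function name is hypothetical, and the listed formats are the commonly cited CIF family resolutions, which may or may not match the set shown in FIG. 1.

```python
# Uncompressed bit rate = width * height * bits per pixel * frames per second.
# ASSUMPTION: 30 frames per second; the text above does not state a frame rate.

# Commonly cited CIF-family luminance resolutions (width, height).
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def uncompressed_mbps(width, height, bits_per_pixel=12, fps=30.0):
    """Return the uncompressed bit rate in megabits per second."""
    return width * height * bits_per_pixel * fps / 1e6


if __name__ == "__main__":
    for name, (w, h) in FORMATS.items():
        print(f"{name:6s} {w}x{h}: {uncompressed_mbps(w, h):.2f} Mbps")
```

Under these assumptions a single CIF stream needs roughly 36.5 Mbps before compression, which indicates why compression is needed on the telecommunication links listed above.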
Presently, efficient transmission and reception of video signals may require encoding and compression of video and accompanying audio data. Video compression coding is a method of encoding digital video data so that less memory is required to store the video data and less bandwidth is required to transmit it. Certain compression/decompression (CODEC) schemes are frequently used to compress video frames to reduce required transmission bit rates. Thus, CODEC hardware and software allow digital video data to be compressed into a smaller binary format than that required by the original (i.e., uncompressed) digital video format.
Several conventional approaches and standards for encoding and compressing source video signals exist. Some standards are designed for a particular application, such as JPEG (Joint Photographic Experts Group) for still images, and H.261, H.263, MPEG (Moving Picture Experts Group), MPEG-2, and MPEG-4 for moving images. The coding standards for moving images typically use block-based motion-compensated prediction on blocks of 16×16 pixels, commonly referred to as macroblocks. A macroblock is a unit of information containing four 8×8 blocks of luminance data and two corresponding 8×8 blocks of chrominance data in accordance with a 4:2:0 sampling structure, where the chrominance data is subsampled 2:1 in both vertical and horizontal directions.
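The macroblock layout described above may be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the block ordering is a choice made for this example, and 8 bits per sample is an assumption. Under that assumption, the six 8×8 blocks of a macroblock average to 12 bits per pixel.

```python
def split_macroblock(luma, cb, cr):
    """Split one 4:2:0 macroblock into its six 8x8 blocks.

    luma is a 16x16 list of rows; cb and cr are 8x8 lists of rows that are
    already subsampled 2:1 horizontally and vertically.
    Returns [Y0, Y1, Y2, Y3, Cb, Cr] (ordering chosen for this sketch).
    """
    if len(luma) != 16 or any(len(row) != 16 for row in luma):
        raise ValueError("luma must be 16x16")
    for plane in (cb, cr):
        if len(plane) != 8 or any(len(row) != 8 for row in plane):
            raise ValueError("chroma planes must be 8x8")

    blocks = []
    for by in (0, 8):          # top half, then bottom half
        for bx in (0, 8):      # left half, then right half
            blocks.append([row[bx:bx + 8] for row in luma[by:by + 8]])
    blocks.append([row[:] for row in cb])
    blocks.append([row[:] for row in cr])
    return blocks


def average_bits_per_pixel(sample_bits=8):
    """Average bits per pixel for 4:2:0 data (ASSUMES 8-bit samples)."""
    samples = 4 * 64 + 2 * 64   # four luminance + two chrominance 8x8 blocks
    return samples * sample_bits / (16 * 16)


# Toy data: each luminance sample holds its own position index.
luma = [[y * 16 + x for x in range(16)] for y in range(16)]
cb = [[128] * 8 for _ in range(8)]
cr = [[128] * 8 for _ in range(8)]
blocks = split_macroblock(luma, cb, cr)
```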
As a practical matter, audio data must also be compressed, transmitted, and synchronized along with the video data. Synchronization, multiplexing, and protocol issues are covered by standards such as H.320 (ISDN-based video conferencing), H.324 (POTS-based video telephony), and H.323 (LAN- or IP-based video conferencing). H.263 (or its predecessor, H.261) provides the video coding part of these standards groups.
A motion estimation and compensation scheme is one conventional method typically used for reducing transmission bandwidth requirements for a video signal. Because the macroblock is the basic data unit, the motion estimation and compensation scheme may compare a given macroblock in a current video frame with the given macroblock's surrounding area in a previously transmitted video frame, and attempt to find a close data match. Typically, a closely matched macroblock in the previously transmitted video frame is spatially offset from the given macroblock by less than a width of the given macroblock. If a close data match is found, the scheme subtracts the given macroblock in the current video frame from the closely matched, offset macroblock in the previously transmitted video frame so that only a difference (i.e., residual) and the spatial offset need to be encoded and transmitted. The spatial offset is commonly referred to as a motion vector. If the motion estimation and compensation process is efficient, the residual macroblock and its motion vector should contain only the information necessary to describe the pixels that change from the previous video frame to the current video frame. Thus, areas of a video frame that do not change (e.g., the background) are not encoded and transmitted.
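The search for a close data match described above may be sketched as an exhaustive block-matching search. The following is a simplified illustration, not the method of any particular standard or encoder: the sum of absolute differences (SAD) as the matching criterion, the search range, and all function names are assumptions made for this example. The residual is computed here as the current block minus the matched block, which is also a sign convention chosen for the sketch.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a current block and a displaced
    block in the reference (previously transmitted) frame."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x]
                         - ref[top + dy + y][left + dx + x])
    return total


def find_motion_vector(cur, ref, top, left, size=16, search=7):
    """Exhaustively test every offset within +/-search samples and return
    ((dy, dx), best_sad). Offsets that leave the frame are skipped."""
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + size > height or c + size > width:
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (dy, dx), cost
    return best_mv, best_sad


def residual_block(cur, ref, top, left, mv, size=16):
    """Difference between the current block and its matched block
    (ASSUMED convention: current minus prediction)."""
    dy, dx = mv
    return [[cur[top + y][left + x] - ref[top + dy + y][left + dx + x]
             for x in range(size)] for y in range(size)]


# Toy 16x16 frames: every reference sample is unique, and the current frame
# is the reference frame with its content displaced by a known offset.
ref = [[y * 16 + x for x in range(16)] for y in range(16)]
cur = [[ref[(y + 1) % 16][(x - 2) % 16] for x in range(16)] for y in range(16)]

mv, cost = find_motion_vector(cur, ref, top=6, left=6, size=4, search=2)
residual = residual_block(cur, ref, 6, 6, mv, size=4)
```

In this toy case the match is exact, so the residual is all zeros and only the motion vector would carry information. Real video rarely matches exactly, and practical encoders generally use faster search strategies than the exhaustive one shown.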
Conventionally, the H.263 standard specifies that the motion vectors used for motion estimation and motion compensation be differentially encoded. Although differential encoding reduces the amount of data required for transmission, any error in which motion vector data is lost or corrupted for one macroblock negatively impacts adjacent macroblocks. The result is a propagation of error due to the corrupted data, which leads to lower video quality.
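The propagation of error described above may be illustrated with a deliberately simplified model of differential encoding. In this sketch each motion vector is predicted from the immediately preceding vector only. That predictor is an assumption made for brevity and is not the prediction rule of H.263, which forms its predictor from several neighboring macroblocks. The function names and sample values are hypothetical.

```python
def encode_differential(vectors):
    """Encode each motion vector as its difference from the previous one.
    SIMPLIFIED predictor: the previous vector, starting from (0, 0)."""
    prev = (0, 0)
    diffs = []
    for mv in vectors:
        diffs.append((mv[0] - prev[0], mv[1] - prev[1]))
        prev = mv
    return diffs


def decode_differential(diffs):
    """Rebuild motion vectors by accumulating the received differences."""
    prev = (0, 0)
    vectors = []
    for d in diffs:
        prev = (prev[0] + d[0], prev[1] + d[1])
        vectors.append(prev)
    return vectors


# Hypothetical motion vectors for four consecutive macroblocks.
vectors = [(1, 0), (2, 0), (2, 1), (3, 1)]
diffs = encode_differential(vectors)

# Corrupt the difference for the second macroblock only.
corrupted = list(diffs)
corrupted[1] = (5, 0)
damaged = decode_differential(corrupted)
```

Although only one transmitted difference was corrupted, every vector decoded from that point onward is wrong in this model, because each one is rebuilt from the vector before it.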
When preparing video frame information for transmission over a packet-switched communication network, encoding schemes transform the video frame information, compressed by motion estimation and compensation techniques, into data packets for transmission across the communication network. Although data packets allow for greater transmission efficiency, lost, corrupted, or delayed data packets can also introduce errors resulting in video quality degradation. Alternatively, video data may be transmitted on heterogeneous communications networks in which one of the endpoints is associated with a circuit-switched network and a gateway or other packet-switched-to-circuit-switched network bridging device is used.
Currently, lost or corrupted data packets often cause reduced video quality. Therefore, there is a need for a system and method that organize and transmit data packets in order to conceal errors caused by data packet loss.