1. Field of the Invention
The present invention relates generally to video communication, and more particularly to motion wake identification for video error concealment.
2. Description of Related Art
Video images have become an increasingly important part of global communication. In particular, video conferencing and video telephony have a wide range of applications such as desktop and room-based conferencing, video over the Internet and over telephone lines, surveillance and monitoring, telemedicine, and computer-based training and education. In each of these applications, video and accompanying audio information are transmitted across telecommunication links, including telephone lines, ISDN, DSL, and radio frequencies.
A standard video format used in video conferencing is Common Intermediate Format (CIF), which is part of the International Telecommunication Union (ITU) H.261 videoconferencing standard. Additional formats with resolutions higher and lower than CIF have also been established. FIG. 1 is a table of the resolution and bit rate requirements for various video formats under the assumption that 12 bits are required, on average, to represent one pixel. The bit rates shown, in megabits per second (Mbps), are for uncompressed color video frames.
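As an illustration of how such uncompressed bit rates arise, the following minimal sketch multiplies the pixel count by the 12 bits-per-pixel average and a frame rate. The frame rate of 30 frames per second and the function name are assumptions made for this example only and are not taken from FIG. 1.

```python
def uncompressed_bit_rate_mbps(width, height, fps, bits_per_pixel=12):
    """Return the raw (uncompressed) bit rate in megabits per second.

    bits_per_pixel defaults to the 12-bit average stated in the text.
    1 Mbps is taken as 10**6 bits per second.
    """
    return width * height * bits_per_pixel * fps / 1e6


# CIF has a luminance resolution of 352x288 pixels.
# The frame rate of 30 frames per second is an assumption for this example.
cif_rate = uncompressed_bit_rate_mbps(352, 288, fps=30)
print(f"CIF at 30 fps: {cif_rate:.1f} Mbps")  # prints "CIF at 30 fps: 36.5 Mbps"
```

Under these assumptions, a single uncompressed CIF stream would need tens of megabits per second, which is why compression is applied before transmission.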
Presently, efficient transmission and reception of video signals may require encoding and compression of video and accompanying audio data. Video compression coding is a method of encoding digital video data such that less memory is required to store the video data and the required transmission bandwidth is reduced. Certain compression/decompression (CODEC) schemes are frequently used to compress video frames to reduce required transmission bit rates. Thus, CODEC hardware and software allow digital video data to be compressed into a more compact binary format than that required by the original (i.e., uncompressed) digital video format.
Several conventional approaches and standards for encoding and compressing source video signals exist. Some standards are designed for a particular application, such as JPEG (Joint Photographic Experts Group) for still images and H.261, H.263, MPEG (Moving Picture Experts Group), MPEG-2, and MPEG-4 for moving images. The coding standards for moving images typically use block-based motion-compensated prediction on blocks of 16×16 pixels, commonly referred to as macroblocks. In one embodiment, a macroblock is a unit of information containing four 8×8 blocks of luminance data and two corresponding 8×8 blocks of chrominance data in accordance with a 4:2:0 chroma sampling structure, where the chrominance data is subsampled 2:1 in both vertical and horizontal directions.
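The macroblock layout described above can be checked with simple arithmetic. The following sketch counts the samples in one 4:2:0 macroblock and derives the average number of bits per pixel. The 8 bits per sample and all names in the code are assumptions made for this illustration.

```python
BLOCK_SIZE = 8        # each block is 8x8 samples
LUMA_BLOCKS = 4       # four luminance blocks cover the 16x16 pixel area
CHROMA_BLOCKS = 2     # two chrominance blocks, each subsampled 2:1 both ways
BITS_PER_SAMPLE = 8   # assumption: 8-bit samples

luma_samples = LUMA_BLOCKS * BLOCK_SIZE ** 2      # 256
chroma_samples = CHROMA_BLOCKS * BLOCK_SIZE ** 2  # 128
pixels = 16 * 16                                  # 256 pixels per macroblock

# Average bits needed per pixel for one macroblock.
bits_per_pixel = (luma_samples + chroma_samples) * BITS_PER_SAMPLE / pixels
print(bits_per_pixel)  # prints 12.0


def macroblocks_per_frame(width, height):
    """Number of whole 16x16 macroblocks in a frame of the given size."""
    return (width // 16) * (height // 16)


print(macroblocks_per_frame(352, 288))  # CIF: prints 396
```

With 8-bit samples, the 4:2:0 structure yields an average of 12 bits per pixel.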
For applications in which audio accompanies video, the audio data must, as a practical matter, also be compressed, transmitted, and synchronized along with the video data. Multiplexing and protocol issues are covered by standards such as H.320 (ISDN-based video conferencing), H.324 (POTS-based video telephony), and H.323 (LAN or IP-based video conferencing). H.263 (or its predecessor, H.261) provides the video coding part of these standards groups.
A motion estimation and compensation scheme is one conventional method typically used for reducing transmission bandwidth requirements for a video signal. Because the macroblock is the basic data unit, the motion estimation and compensation scheme may compare a given macroblock in a current video frame with the given macroblock's surrounding area in a previously transmitted video frame, called a reference frame, and attempt to find a close data match. If a close data match is found, the scheme subtracts the given macroblock in the current video frame from the closely matched, offset macroblock in the previously transmitted reference video frame so that only a difference (i.e., residual) and the spatial offset need to be encoded and transmitted. The spatial offset is commonly referred to as a motion vector. If the motion estimation and compensation process is efficient, the remaining residual macroblock should contain a small amount of information, thereby leading to efficient compression.
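The matching process described above can be sketched as an exhaustive block search. The following example is illustrative only: the sum of absolute differences (SAD) as the matching criterion, the search range of ±7 pixels, the sign convention of the residual, and all function names are assumptions for this sketch and are not prescribed by any particular coding standard.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of cur at
    (cx, cy) and the n x n block of ref at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def find_motion_vector(cur, ref, x, y, n=16, search=7):
    """Exhaustively search ref around (x, y) for the best match to the
    macroblock of cur at (x, y). Returns ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, x, y, x, y, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, x, y, rx, ry, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def residual(cur, ref, x, y, mv, n=16):
    """Difference between the current macroblock and its matched reference
    block (current minus reference is the convention chosen here)."""
    dx, dy = mv
    return [[cur[y + j][x + i] - ref[y + dy + j][x + dx + i]
             for i in range(n)] for j in range(n)]


# Synthetic 32x32 reference frame with a non-repeating pattern.
ref = [[(7 * x + 13 * y + x * y) % 256 for x in range(32)] for y in range(32)]
# Current frame: the reference content moved 2 pixels right and 1 pixel down.
cur = [[ref[max(y - 1, 0)][max(x - 2, 0)] for x in range(32)] for y in range(32)]

mv, cost = find_motion_vector(cur, ref, 8, 8)
res = residual(cur, ref, 8, 8, mv)
print(mv, cost)  # prints "(-2, -1) 0"
```

In this synthetic case the match is exact, so the residual macroblock is all zeros and only the motion vector carries information. Real video generally leaves a nonzero residual.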
Video data may be transmitted over packet-switched communication networks or over heterogeneous communication networks in which one of the endpoints is associated with a circuit-switched network, and a gateway or other packet-switched to circuit-switched network bridging device is used. When preparing video frame information for transmission over a packet-switched communication network, encoding schemes transform the video frame information, compressed by motion estimation and compensation techniques or other compression schemes, into data packets for transmission across the communication network. Data packets are sometimes lost, corrupted, or delayed, which can introduce errors resulting in video quality degradation.
In particular, motion prediction errors resulting from corrupted or lost data packets tend to be persistent in motion wake regions of video frames. A motion wake region of a video frame is a region where a moving object has uncovered a part of a stationary or near-stationary background. Errors located in a given motion wake region can propagate to other regions of the video frame, increase in magnitude, and cause distracting visual artifacts.
Therefore, there is a need for a system and a method to identify macroblocks located in motion wake regions for reducing visual artifacts caused by motion prediction errors, thereby improving video quality.