1. Field of the Invention
The invention relates to video processing. More particularly, the invention relates to motion compensation and motion estimation algorithms.
2. Description of the Related Art
Multimedia processing systems, such as video encoders, may encode multimedia data using encoding methods based on international standards such as the MPEG-x and H.26x standards. Such encoding methods generally are directed to compressing the multimedia data for transmission and/or storage. Compression is broadly the process of removing redundancy from the data. In addition, video display systems may transcode or transform multimedia data for various purposes such as, for example, to ensure compatibility with display standards such as NTSC, HDTV, or PAL, to increase frame rate in order to reduce perceived motion blur, and to achieve smooth motion portrayal of content with a frame rate that differs from that of the display device. These transcoding methods may perform functions similar to those of the encoding methods when performing frame rate conversion, de-interlacing, etc.
A video signal may be described in terms of a sequence of pictures, which include frames (an entire picture) or fields (e.g., an interlaced video stream comprises fields of alternating odd or even lines of a picture). As used herein, the term “frame” may refer generally to a picture, a frame, or a field. Multimedia processors, such as video encoders, may encode a frame by partitioning it into blocks or “macroblocks” of, for example, 16×16 pixels. The encoder may further partition each macroblock into subblocks. Each subblock may further comprise additional subblocks. For example, subblocks of a macroblock may include 16×8 and 8×16 subblocks. Subblocks of the 8×16 subblocks may include 8×8 subblocks, and so forth. Depending on context, a block may refer to either a macroblock or a subblock.
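As a rough illustration of the partitioning described above (a minimal sketch, not taken from any particular standard; the function name and frame dimensions are hypothetical), a frame can be split into non-overlapping 16×16 macroblocks as follows:

```python
import numpy as np

def partition_into_macroblocks(frame, size=16):
    """Split a frame (an H x W pixel array) into non-overlapping
    size x size macroblocks, keyed by their top-left coordinate.

    Assumes H and W are multiples of `size`, as is typical for
    macroblock-based standards such as MPEG-x and H.26x."""
    h, w = frame.shape
    blocks = {}
    for y in range(0, h, size):
        for x in range(0, w, size):
            blocks[(y, x)] = frame[y:y + size, x:x + size]
    return blocks

# A 32x48 frame yields (32/16) * (48/16) = 6 macroblocks.
frame = np.zeros((32, 48), dtype=np.uint8)
blocks = partition_into_macroblocks(frame)
```

Each macroblock could be recursively partitioned the same way (e.g., into 16×8, 8×16, or 8×8 subblocks) by applying the same slicing at smaller sizes.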
Video encoding methods compress video signals by using lossless or lossy compression algorithms to compress each frame or blocks of the frame. Intra-frame coding refers to encoding a frame using data from that frame. Inter-frame coding refers to predictive encoding schemes such as schemes that comprise encoding a frame based on other, “reference,” frames. For example, video signals often exhibit temporal redundancy in which frames near each other in the temporal sequence of frames have at least portions that match or at least partially match each other. Encoders can take advantage of this temporal redundancy to reduce the size of encoded data.
Encoders may take advantage of this temporal redundancy by encoding a frame in terms of the difference between the frame and one or more reference frames. For example, video encoders may use motion estimation based algorithms that match blocks of the frame being encoded to portions of one or more other frames. The block of the encoded frame may be shifted within the frame relative to the matching portion of the reference frame. This shift is characterized by a motion vector. Any differences between the block and the partially matching portion of the reference frame may be characterized in terms of what is referred to as a residual.
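One common form of motion estimation is full-search block matching under a sum-of-absolute-differences (SAD) criterion. The sketch below (an illustrative example, not any particular encoder's implementation; the function name and search range are assumptions) finds, for a block at a given position in the current frame, the motion vector into the reference frame that minimizes SAD, and computes the corresponding residual:

```python
import numpy as np

def motion_estimate(block, reference, top, left, search_range=4):
    """Full-search block matching: find the motion vector (dy, dx) that
    minimizes the sum of absolute differences (SAD) between `block`
    (located at (top, left) in the current frame) and a same-sized
    region of `reference`. Returns the motion vector and the residual."""
    bh, bw = block.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + bh > reference.shape[0] or x + bw > reference.shape[1]:
                continue
            candidate = reference[y:y + bh, x:x + bw]
            sad = np.abs(block.astype(int) - candidate.astype(int)).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    dy, dx = best_mv
    matched = reference[top + dy:top + dy + bh, left + dx:left + dx + bw]
    residual = block.astype(int) - matched.astype(int)
    return best_mv, residual
```

When the block is an exact shifted copy of reference content, the returned residual is all zeros and only the motion vector need be encoded; otherwise the (typically small) residual captures the remaining differences.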
Reconstruction of the encoded frame involves a technique known as motion compensation. In motion compensation, the already decoded (reconstructed) pixels pointed to by the motion vector are added to the encoded difference or residual value, resulting in the reconstructed pixels of the block. Decoding operations can also include creation of video frames between two or more already reconstructed frames. Frame rate conversion, de-interlacing and transcoding are examples of processes where decoder devices create new video data based on already reconstructed video data. These motion compensation techniques can use the encoded data, such as motion vectors and residual error, as well as the reconstructed video data for estimating the newly created frames. In addition, a display device receiving uncompressed (or already decompressed) multimedia data may perform motion estimation and/or motion compensation techniques for transforming (e.g., frame rate conversion, de-interlacing, etc.) the multimedia data from one format to another format to be displayed.
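The motion compensation step described above amounts to fetching the reference pixels addressed by the motion vector and adding the residual. A minimal sketch (illustrative names and values, not any particular decoder's code) might look like:

```python
import numpy as np

def motion_compensate(reference, mv, top, left, residual):
    """Reconstruct a block at (top, left): fetch the already decoded
    reference pixels pointed to by motion vector `mv` = (dy, dx) and
    add the residual, yielding the reconstructed pixels."""
    dy, dx = mv
    bh, bw = residual.shape
    predicted = reference[top + dy:top + dy + bh,
                          left + dx:left + dx + bw].astype(int)
    return predicted + residual

# Example: reconstruct a 4x4 block at (2, 2) with motion vector (1, 1).
reference = np.arange(64).reshape(8, 8)
residual = np.ones((4, 4), dtype=int)
reconstructed = motion_compensate(reference, (1, 1), 2, 2, residual)
```

Interpolated frames for frame rate conversion can be built from the same ingredients, using motion vectors and residuals from the bitstream together with already reconstructed frames on either side of the new frame.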
One of the drawbacks of typical implementations of motion estimation and motion compensation schemes, such as block matching and optical flow, is that these techniques usually estimate only one motion vector for every block or pixel. For most video sequences, this does not cause problems. However, if the video sequence contains a semi-transparent overlay such as a menu, on-screen display (OSD), or logo, each block or pixel can be more efficiently represented by association with more than one motion vector. Such dual motion vector calculations greatly increase the computational complexity of the motion estimation and/or motion compensation schemes. Accordingly, a need exists for reducing the complexity of searching for multiple motion vectors for the encoding and/or reconstruction of video data involving transparent overlays.
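The difficulty with semi-transparent overlays can be illustrated with a toy two-layer model (all names, the opacity value, and the layer contents below are illustrative assumptions): each observed pixel is an alpha blend of a static overlay layer, whose motion vector is (0, 0), and a moving background layer. No single displacement of the previous frame reproduces the current frame, because the two layers move differently:

```python
import numpy as np

alpha = 0.5  # assumed overlay opacity
# Moving background: a horizontal gradient.
background = np.tile(np.arange(8), (8, 1)).astype(float)
# Static semi-transparent overlay with spatial structure (e.g., a logo).
overlay = np.zeros((8, 8))
overlay[2:6, 2:6] = 100.0

def observed(background_shift):
    """Alpha-blend the static overlay over a background shifted
    horizontally by `background_shift` pixels."""
    moved = np.roll(background, background_shift, axis=1)
    return alpha * overlay + (1 - alpha) * moved

frame_t0 = observed(0)
frame_t1 = observed(1)  # background moved one pixel; overlay stayed put
# Shift 0 matches the overlay component but not the background;
# shift 1 matches the background component but drags the overlay along.
```

A single-motion-vector matcher must compromise between the two layers, whereas associating each pixel with two motion vectors (one per layer) can represent the sequence exactly, at the cost of a much larger search space.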