1. Field
This invention relates to methods and apparatus for decoding compressed video data where various weighted prediction methods were used for encoding the video data.
2. Background
Hybrid coders such as MPEG-1, MPEG-2, MPEG-4 (collectively designated MPEG-x), H.261, H.262, H.263, and H.264 (collectively designated H.26x) provide spatial, temporal and signal-to-noise ratio (SNR) scalability. In hybrid coding, temporal redundancy is removed by motion-compensated prediction (MCP). A video is typically divided into a series of groups of pictures (GOPs), where each GOP begins with an intra-coded frame (I) followed by an arrangement of forward or backward predictive-coded frames (P) and bi-directional predicted frames (B). Both P-frames and B-frames are inter-frames.
B-frames provide a significant reduction in bit-rate and also provide temporal scalability: bi-directional prediction can optionally be introduced for frames between I-frames and P-frames, and the bit-stream remains playable without the B-frames, although greater temporal smoothness and a higher frame rate are observed when the B-frames are included in decoding and playback. B-frames are predicted from multiple frames and can be computed as a simple average of the frames from which they are predicted. B-frames can also be computed using weighted prediction, such as a time-based weighted average or a weighted average based on a parameter such as luminance. Weighted prediction places more emphasis on one of the frames, or on certain characteristics of the frames, and is used to predict B-frames more efficiently. Different codecs implement weighted prediction in different ways. Real Video 9 provides a 14-bit unsigned weighting factor to be multiplied by the individual forward and backward predictions, and also provides a direct mode in which temporal weights are derived from the relative temporal position of the B-frame with respect to the two reference frames. MPEG-4, in the Simple Scalable Profile, provides for simple averaging of the past and future reference frames. Windows Media Video 9 also provides for simple averaging, as in MPEG-4. H.264 weighted prediction provides for simple averaging of past and future frames, direct mode weighting based on temporal distance to past and future frames, and weighted prediction based on luminance (or another parameter) of past and future frames.
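The weighting modes described above can be illustrated with a minimal sketch. It is not taken from any of the named codecs: the function names are hypothetical, the weights are floating-point for readability (actual codecs use fixed-point weights with codec-specific bit widths and rounding), and 8-bit samples are assumed.

```python
import math


def simple_average(past, future):
    """Equal-weight prediction: rounded mean of the two reference blocks."""
    return [(p + f + 1) >> 1 for p, f in zip(past, future)]


def temporal_weights(t_past, t_cur, t_future):
    """Direct-mode style weights (illustrative): the temporally nearer
    reference frame receives the larger weight."""
    span = t_future - t_past
    w_future = (t_cur - t_past) / span
    w_past = 1.0 - w_future
    return w_past, w_future


def weighted_prediction(past, future, w_past, w_future, offset=0, max_val=255):
    """General weighted combination of two reference blocks,
    rounded to the nearest integer and clipped to the sample range."""
    out = []
    for p, f in zip(past, future):
        value = math.floor(w_past * p + w_future * f + 0.5) + offset
        out.append(min(max(value, 0), max_val))
    return out


# A B-frame at time 1 between references at times 0 and 4:
# the past reference is nearer, so it is weighted 0.75 against 0.25.
w_past, w_future = temporal_weights(0, 1, 4)
block = weighted_prediction([100, 40], [200, 80], w_past, w_future)
```

Simple averaging is the special case in which both weights equal 0.5, which is why a single weighted combination can in principle cover all three modes.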
As discussed above, different video codec implementations can each have different weighting modes, such as direct mode, luminance weighting and simple averaging, as well as different bit allocations for weighting factors. A single decoder design capable of decoding multiple types of weighted bi-directional predictive video bitstreams is desired and would result in a highly efficient and less costly design of software, firmware and hardware.