An International Telecommunication Union-Telecommunication Standardization Sector (ITU-T) recommendation H.264 and an International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) recommendation MPEG-4 Part 10 for Advanced Video Coding (MPEG-4/AVC) concern digital video codecs. The H.264|MPEG-4/AVC documents (hereafter simply referred to as H.264) specify both decoder operation (semantics) and compressed video representation (bitstream syntax). Due to efficient syntax and many new predictive options, a conventional H.264 video encoder produces bitstreams that provide MPEG-2 quality with a 50% lower bitrate. As such, video distribution channels such as high definition digital optical media formats (Blu-ray Disc™, HD DVD), cable (High Definition Video On Demand), satellite (DirecTV, DISH Network™), Internet Protocol Television (IPTV), terrestrial high definition television (TV), pay TV (France, England) and mobile systems (3GPP) are deploying H.264 capable equipment. Blu-ray is a trademark of the Blu-ray Disc Association, Tokyo, Japan. DISH Network is a registered trademark of EchoStar Satellite L.L.C., Englewood, Colo.
Two classes of predictions are used in ISO/IEC and ITU block-based hybrid prediction and transform video codecs. Inter-frame predictions and intra-frame predictions are used to remove redundancy, such that improved compression is possible. Inter-frame predictions use previously decoded, sometimes motion-compensated, video frames or fields for prediction of current blocks. (Predictive) P-blocks use only one block from a previous frame or field for prediction. (Bi-predictive) B-blocks use a (potentially weighted) average of predictions from two previously decoded blocks. In contrast, intra-frame predictions use previously decoded adjacent blocks within the current field or frame. Key-frames that exclusively use intra-frame predictions (i.e., I-frames) may be used as access points into a compressed bitstream for channel changes or error recovery. Intra-predictions have a significant effect upon how mismatches from approximations or errors accumulate in video and, therefore, upon the effectiveness of different approximation-based memory reduction techniques.
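The difference between P-block and B-block prediction described above may be illustrated with a minimal sketch. The function names, the integer weights `w0` and `w1`, and the rounding rule are assumptions chosen for illustration only; the sketch is not the normative H.264 weighted-prediction process, and motion compensation is omitted (the reference blocks are assumed to have been fetched already).

```python
def predict_p_block(ref_block):
    """P-block: the prediction is one previously decoded block."""
    return [row[:] for row in ref_block]


def predict_b_block(ref0, ref1, w0=1, w1=1):
    """B-block: (potentially weighted) average of two predictions.

    w0 and w1 are hypothetical positive integer weights; equal weights
    give a plain average. Results are rounded to the nearest integer.
    """
    total = w0 + w1
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(r0, r1)]
        for r0, r1 in zip(ref0, ref1)
    ]


def residual(current, prediction):
    """Difference that an encoder would transform and code."""
    return [[c - p for c, p in zip(cr, pr)]
            for cr, pr in zip(current, prediction)]


if __name__ == "__main__":
    ref0 = [[10, 20], [30, 40]]
    ref1 = [[20, 40], [50, 60]]
    current = [[16, 31], [41, 49]]
    pred = predict_b_block(ref0, ref1)
    print(pred)
    print(residual(current, pred))
```

In this toy case the averaged prediction lies closer to the current block than either reference alone, so the residual that remains to be coded is small.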
Referring to FIG. 1, a diagram of a conventional H.264 multi-frame inter-prediction is shown. Referring to FIG. 2, a diagram of a conventional H.264 hierarchical group of pictures (GOP) with reference B-frames is shown. Picture-storage memory is the most expensive element of a video decoder and increasingly dominates decoder costs. Application memory requirements for H.264 are typically specified to be higher than for other commonly specified codecs, such as ISO MPEG-2 (ITU-T H.262) or Society of Motion Picture and Television Engineers (SMPTE) VC-1 (WMV-9, Microsoft Windows Media 9). The additional memory is used to support H.264 inter-frame predictive coding tools, such as multiple reference frames, hierarchical frames and reference B-frames, as shown in FIG. 1 and FIG. 2.
Picture-storage memory can be reduced by downsampling (i.e., reducing resolution horizontally and/or vertically), as described for MPEG-2. In practice, downsampling may be merged with the final codec block/picture reconstruction stage. For (i) MPEG-2, the IDCT (inverse discrete cosine transform) stage may be used and for (ii) H.264 and VC-1, the in-loop filters (i.e., de-blocking filters) may be used. Similarly, an efficient implementation may merge the upsample with the sub-pel motion-compensated inter-prediction load. For MPEG-2, unlike H.264|MPEG-4/AVC, horizontal sub-sampling by a factor of two with either of the following two simple methods yields good quality: (i) downsample without filtering (i.e., drop alternate columns) and upsample with bilinear interpolation and (ii) downsample by averaging adjacent columns and upsample without filtering (i.e., duplication, sample and hold).
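The two simple factor-of-two horizontal methods listed above may be sketched on a single row of samples as follows. The function names, the integer rounding, and the edge handling (repeating the last stored sample) are assumptions made for illustration; an even row width is assumed, and a practical decoder would merge these steps into reconstruction and motion compensation rather than run them separately.

```python
def downsample_drop(row):
    """Method (i), store: keep every other column, no filtering."""
    return row[0::2]


def upsample_bilinear(half, width):
    """Method (i), fetch: rebuild dropped columns by averaging neighbors."""
    out = []
    for x in range(width):
        i = x // 2
        if x % 2 == 0 or i + 1 >= len(half):
            # Stored column, or right edge with no neighbor: reuse sample.
            out.append(half[i])
        else:
            out.append((half[i] + half[i + 1] + 1) // 2)
    return out


def downsample_average(row):
    """Method (ii), store: average each pair of adjacent columns."""
    return [(row[i] + row[i + 1] + 1) // 2 for i in range(0, len(row) - 1, 2)]


def upsample_hold(half, width):
    """Method (ii), fetch: duplicate each stored sample (sample and hold)."""
    return [half[min(x // 2, len(half) - 1)] for x in range(width)]


if __name__ == "__main__":
    row = [10, 20, 30, 40, 50, 60, 70, 80]
    stored_i = downsample_drop(row)
    stored_ii = downsample_average(row)
    print(stored_i, upsample_bilinear(stored_i, len(row)))
    print(stored_ii, upsample_hold(stored_ii, len(row)))
```

Either method stores half as many samples per row. The reconstructed row differs from the original wherever the content is not locally smooth, and that difference is the mismatch that can accumulate through subsequent predictions.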