Multiple applications for real-time digital video communication and multiple international standards for video coding have been, and are continuing to be, developed. Low bit rate communications, such as video telephony and conferencing, led to the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.261 standard, which offered data rates as multiples of 64 kbps. The Moving Picture Experts Group's first standard (MPEG-1) was then developed, providing picture quality comparable to that of VHS videotape. Subsequently, the H.263, MPEG-2, and MPEG-4 standards have been promulgated.
ITU-T H.264 is a recent coding standard that makes use of several coding tools to provide better compression performance than the existing standards described above. The ITU-T H.264 standard was developed jointly with the Advanced Video Coding (AVC) standard of the Moving Picture Experts Group, and the two are jointly maintained so that they have identical technical content. At the core of these standards is the hybrid video coding technique of block motion compensation (prediction) encoding plus transform coding of the prediction error. Block motion compensation is used to remove temporal redundancy between successive pictures (frames or fields) by prediction from prior pictures, whereas transform coding is used to remove spatial redundancy within each block of both temporal and spatial prediction errors.
Traditional block motion compensation schemes basically assume that between successive pictures an object in a scene undergoes a displacement in the x- and y-directions and these displacements define the components of a motion vector. Thus an object in one picture can be predicted from the object in a prior picture by using the object's motion vector. Block motion compensation partitions a picture into blocks, treats each block as an object, and then determines its motion vector which locates the most-similar block in a prior picture. This is known as motion estimation.
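The motion estimation step described above can be illustrated with a minimal sketch. It assumes an exhaustive (full) search over a small window and the sum of absolute differences (SAD) as the similarity measure; neither choice is mandated by the text, and the function names, block size, and search range are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur, ref, bx, by, n=4, search=2):
    """Try every displacement in [-search, search] in x and y.
    Returns ((dx, dy), cost) for the most similar block inside `ref`."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference picture.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Illustrative data: an 8x8 reference picture and a current picture whose
# content is the reference shifted by one pixel left and one pixel down.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y - 1][x + 1] if y >= 1 and x + 1 < 8 else 0 for x in range(8)]
       for y in range(8)]
motion_vector, cost = estimate_motion(cur, ref, bx=2, by=2)
```

The returned motion vector points from the current block to its best match in the prior picture; the number of SAD evaluations grows with the square of the search range, which is one source of the encoder's computational load.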
In practice, this simple assumption works satisfactorily in most cases, and block motion compensation has thus become the most widely used technique for temporal redundancy removal in video coding standards. Periodically, pictures coded without motion compensation are inserted to avoid error propagation. Blocks encoded without motion compensation are called intra-coded, whereas blocks encoded with motion compensation are called inter-coded.
Block motion compensation methods typically decompose a picture into macroblocks where each macroblock contains four 8×8 luminance (Y) blocks plus two 8×8 chrominance (Cb and Cr or U and V) blocks. In the H.264/AVC standard, other block sizes are also used, such as 4×4. The residual (prediction error) block can then be encoded through block transformation, transform coefficient quantization, and entropy encoding. The transform of a block converts the pixel values of a block from the spatial domain into a frequency domain for quantization. This process takes advantage of de-correlation and energy compaction of transforms such as the two-dimensional discrete cosine transform (DCT) or an integer transform approximating a DCT.
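As an illustration of the transform and quantization steps, the following sketch applies the 4×4 integer core transform matrix used in H.264/AVC to a residual block and then quantizes the coefficients. The uniform quantizer with a single step size is a simplifying assumption; the standard folds transform scaling into a quantization-parameter-dependent quantizer that is not reproduced here. The function names are hypothetical.

```python
# 4x4 integer approximation of the DCT (H.264/AVC forward core transform).
CORE = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Compute CORE * block * CORE^T, moving the residual from the
    spatial domain to a frequency-like domain."""
    return matmul(matmul(CORE, block), transpose(CORE))


def quantize(coeffs, step):
    """Uniform quantizer (assumption); int() truncates toward zero,
    so small coefficients collapse to 0."""
    return [[int(c / step) for c in row] for row in coeffs]


# A flat residual block: all of its energy lands in the single DC coefficient.
flat = [[3] * 4 for _ in range(4)]
flat_coeffs = forward_transform(flat)

# A horizontal ramp: energy spreads only along the first row of coefficients.
ramp = [[0, 1, 2, 3] for _ in range(4)]
ramp_levels = quantize(forward_transform(ramp), step=8)
```

For the flat block only the top-left coefficient is nonzero, which is the energy compaction the paragraph above refers to; after quantization most coefficients of smooth residuals are zero and cost little to entropy-encode.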
Although intra-coded pictures are not encoded with motion compensation, spatial prediction for blocks in the pictures may be performed by extrapolation from already encoded portions of the picture. Typically, pictures are encoded in raster scan order of blocks, so pixels of blocks above and to the left of a current block can be used for intra prediction. Again, transformation of the prediction errors for a block can remove spatial correlations and enhance coding efficiency.
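A minimal sketch of this kind of spatial prediction for a 4×4 block follows. It shows only three extrapolation modes (vertical, horizontal, and DC) and selects among them by the sum of absolute residuals; the selection criterion and all function names are assumptions made for illustration.

```python
def predict_vertical(top, left):
    """Copy the row of pixels above the block down every row."""
    return [list(top) for _ in range(4)]


def predict_horizontal(top, left):
    """Copy the column of pixels left of the block across every column."""
    return [[left[y]] * 4 for y in range(4)]


def predict_dc(top, left):
    """Fill the block with the rounded mean of the eight neighbouring pixels."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


PREDICTORS = {
    "vertical": predict_vertical,
    "horizontal": predict_horizontal,
    "dc": predict_dc,
}


def residual(block, pred):
    """Prediction error that would be passed on to the transform."""
    return [[block[y][x] - pred[y][x] for x in range(4)] for y in range(4)]


def best_intra_mode(block, top, left):
    """Return (mode name, cost, residual) for the cheapest predictor."""
    best = None
    for name, predictor in PREDICTORS.items():
        res = residual(block, predictor(top, left))
        cost = sum(abs(v) for row in res for v in row)
        if best is None or cost < best[1]:
            best = (name, cost, res)
    return best


# A block whose columns continue the pixels directly above it.
top = [10, 20, 30, 40]    # reconstructed pixels above the block
left = [10, 10, 10, 10]   # reconstructed pixels to the left of the block
block = [[10, 20, 30, 40] for _ in range(4)]
mode, cost, res = best_intra_mode(block, top, left)
```

Because `top` and `left` come from blocks that were already encoded in raster scan order, a decoder can form the same prediction and only the residual needs to be transmitted.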
The H.264/AVC standard thus specifies several prediction modes for both inter-prediction and intra-prediction. These prediction modes include:
B_Direct:
Inter prediction applied, but no motion information is coded.
Inter_16×16:
Inter prediction applied for the whole macroblock (16×16).
Inter_16×8:
Macroblock partitioned into two 16×8 blocks; inter prediction applied to each block.
Inter_8×16:
Macroblock partitioned into two 8×16 blocks; inter prediction applied to each block.
Inter_8×8:
Macroblock partitioned into four 8×8 sub-macroblocks, each sub-macroblock either remaining 8×8 or being further partitioned into 8×4, 4×8, or 4×4 blocks; inter prediction applied to each sub-macroblock.
B_Direct/P_Skip:
Inter prediction applied; neither motion information nor residual data is coded; simple repetition of the macroblock of the prior frame.
Intra_16×16:
Intra prediction applied for the whole macroblock (16×16).
Intra_4×4:
Macroblock partitioned into sixteen 4×4 blocks; intra prediction applied to each block.
I-PCM:
Macroblock samples coded without any transformation or compression.
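To make the inter partitioning options in the list above concrete, the following sketch enumerates every way a 16×16 macroblock can be divided under the 16×16, 16×8, 8×16, and 8×8 modes, with each 8×8 sub-macroblock independently taking one of its four shapes. The rectangle representation and the names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Rectangles are (x, y, width, height) relative to the macroblock origin.
WHOLE_MODES = {
    "16x16": [(0, 0, 16, 16)],
    "16x8": [(0, 0, 16, 8), (0, 8, 16, 8)],
    "8x16": [(0, 0, 8, 16), (8, 0, 8, 16)],
}

# (width, height) choices available inside each 8x8 sub-macroblock.
SUB_SHAPES = [(8, 8), (8, 4), (4, 8), (4, 4)]


def split_sub_macroblock(x0, y0, shape):
    """Tile the 8x8 sub-macroblock at (x0, y0) with blocks of `shape`."""
    w, h = shape
    return [(x0 + x, y0 + y, w, h)
            for y in range(0, 8, h) for x in range(0, 8, w)]


def inter_8x8_partitionings():
    """Yield every 8x8-mode layout: one shape choice per sub-macroblock."""
    origins = [(0, 0), (8, 0), (0, 8), (8, 8)]
    for shapes in product(SUB_SHAPES, repeat=4):
        rects = []
        for (x0, y0), shape in zip(origins, shapes):
            rects.extend(split_sub_macroblock(x0, y0, shape))
        yield rects


def all_inter_partitionings():
    """All layouts for the three whole-macroblock modes plus the 8x8 mode."""
    return list(WHOLE_MODES.values()) + list(inter_8x8_partitionings())


layouts = all_inter_partitionings()
layout_count = len(layouts)                       # 3 + 4**4 layouts
largest_layout = max(len(rects) for rects in layouts)  # sixteen 4x4 blocks
```

Each layout may need its own motion search per block, so an encoder that evaluated every layout exhaustively would repeat motion estimation many times per macroblock.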
The computational complexity of selecting the optimal prediction mode is a problem for devices encoding according to the H.264/AVC standard, and can put a strain on processing. Real-time systems such as video streaming must be able to encode a nominal number of frames per second and sustain this rate of processing. Embodiments of the present application are directed toward resolving the processing strain in encoding devices using block motion compensation encoding.
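The scale of this processing load can be estimated with a short calculation. The sketch below assumes an exhaustive decision that evaluates a cost for every candidate mode of every macroblock; the picture size, frame rate, cost function, and function names are illustrative assumptions rather than values taken from the standard.

```python
def macroblocks_per_frame(width, height, mb_size=16):
    """Number of macroblocks covering a picture (partial ones rounded up)."""
    cols = (width + mb_size - 1) // mb_size
    rows = (height + mb_size - 1) // mb_size
    return cols * rows


def mode_evaluations_per_second(width, height, fps, modes_per_macroblock):
    """Cost evaluations needed per second if every mode is tried."""
    return macroblocks_per_frame(width, height) * fps * modes_per_macroblock


def select_mode(candidates, cost_fn):
    """Exhaustive decision: evaluate every candidate and keep the cheapest."""
    best_mode, best_cost = None, None
    for mode in candidates:
        cost = cost_fn(mode)
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Illustrative workload: a 1280x720 picture at 30 frames per second with
# nine candidate modes per macroblock.
per_frame = macroblocks_per_frame(1280, 720)
per_second = mode_evaluations_per_second(1280, 720, 30, 9)
```

Under these assumptions a frame holds 3600 macroblocks, which gives 972,000 mode evaluations per second before any sub-partition choices or motion searches are counted, and this rate must be sustained for real-time operation.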