In video encoding, a frame of a video sequence may be partitioned into rectangular regions or blocks. A video block may be encoded in Intra-mode (I-mode) or Inter-mode (P-mode).
FIG. 1 shows a diagram of a prior art video encoder for the I-mode. An encoder may be configured to partition a frame into a plurality of blocks and encode each of the blocks separately. As an example, the encoder may partition the frame into a plurality of 16×16 “macroblocks” that each include sixteen rows of pixels and sixteen columns of pixels. Each macroblock may in turn comprise a grouping of sub-partition blocks (referred to herein as “blocks”). As an example, a 16×16 macroblock may contain sixteen 4×4 blocks, or sub-partition blocks of another size.
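As a minimal sketch of the partitioning described above, the following Python fragment splits a frame into 16×16 macroblocks and one macroblock into sixteen 4×4 blocks. The function name `partition`, the raster ordering of the tiles, and the 32×48 sample frame are assumptions made for illustration only; the sketch also assumes the frame dimensions are exact multiples of the tile size.

```python
def partition(frame, size):
    """Split a 2-D list of pixels into size x size tiles, in raster order.

    Assumes the height and width of `frame` are exact multiples of `size`.
    """
    rows, cols = len(frame), len(frame[0])
    tiles = []
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            tiles.append([row[left:left + size]
                          for row in frame[top:top + size]])
    return tiles


# Hypothetical 32x48 frame whose pixel value encodes its raster position.
frame = [[r * 48 + c for c in range(48)] for r in range(32)]

macroblocks = partition(frame, 16)     # 2 rows x 3 columns of macroblocks
blocks = partition(macroblocks[0], 4)  # 4x4 blocks of the first macroblock
```

Reusing the same routine at both levels reflects that a macroblock is simply a grouping of smaller blocks; other sub-partition sizes could be obtained by changing the second argument.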
In FIG. 1, a spatial predictor 102 forms a predicted block 103 for video block 100 by using pixels from neighboring blocks in the same frame. The neighboring blocks used for prediction may be specified by a prediction mode 101. A summer 104 computes the prediction error 106, i.e., the difference between the video block 100 and the predicted block 103. Transform module 108 projects the prediction error 106 onto a set of basis or transform functions. In typical implementations, the transform functions can be derived from the discrete cosine transform (DCT), the Karhunen-Loeve Transform (KLT), or any other transform. For example, a set of transform functions can be expressed as {f0, f1, f2, . . . , fN}, where each fn denotes an individual transform function.
The transform module 108 outputs a set of transform coefficients 110 corresponding to the weights assigned to each of the transform functions. For example, a set of coefficients {c0, c1, c2, . . . , cN} may be computed, corresponding to the set of transform functions {f0, f1, f2, . . . , fN}. The transform coefficients 110 are subsequently quantized by quantizer 112 to produce quantized transform coefficients 114. The quantized coefficients 114 and prediction mode 101 may be transmitted to the decoder.
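The encoder path of FIG. 1 (summer, transform module, quantizer) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the encoder of any particular standard: it assumes a separable orthonormal DCT-II as the transform, so that each two-dimensional transform function is the outer product of two one-dimensional basis functions, and it assumes a uniform quantizer with a single step size. All function names are hypothetical.

```python
import math


def dct_basis(n):
    """Orthonormal 1-D DCT-II basis; basis[k][i] is sample i of function k."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                      for i in range(n)])
    return basis


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def prediction_error(block, predicted):
    """Summer: pixel-wise difference between the block and its prediction."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, predicted)]


def forward_transform(error):
    """Project the error onto the separable basis: C * error * C^T."""
    c = dct_basis(len(error))
    return matmul(matmul(c, error), transpose(c))


def quantize(coeffs, step):
    """Uniform quantizer (an assumption): nearest integer multiple of step."""
    return [[int(round(v / step)) for v in row] for row in coeffs]


# Hypothetical 4x4 block and prediction that differ by 2 at every pixel.
block = [[102] * 4 for _ in range(4)]
predicted = [[100] * 4 for _ in range(4)]

error = prediction_error(block, predicted)
coeffs = forward_transform(error)
levels = quantize(coeffs, step=2.0)
```

In this example the prediction error is constant, so its energy is concentrated in the single lowest-frequency coefficient and the remaining quantized coefficients are zero.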
FIG. 1A depicts a video decoder for the I-mode. In FIG. 1A, quantized coefficients 1000 are provided by the encoder to the decoder, and supplied to the inverse transform module 1004. The inverse transform module 1004 reconstructs the prediction error 1003 based on the coefficients 1000 and the fixed set of transform functions, e.g., {f0, f1, f2, . . . , fN}. The prediction mode 1002 is supplied to the inverse spatial prediction module 1006, which generates a predicted block 1007 based on pixel values of already decoded neighboring blocks. The predicted block 1007 is combined with the prediction error 1003 to generate the reconstructed block 1010. The difference between the reconstructed block 1010 and the original block 100 in FIG. 1 is known as the reconstruction error.
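The decoder path of FIG. 1A can be sketched in the same spirit. The fragment below assumes the same separable orthonormal DCT-II basis and uniform quantizer step as an illustrative encoder might use; neither choice is taken from the source. The predicted block is passed in directly here, whereas in FIG. 1A it would be produced by the inverse spatial prediction module 1006. All function names are hypothetical.

```python
import math


def dct_basis(n):
    """Orthonormal 1-D DCT-II basis; basis[k][i] is sample i of function k."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                      for i in range(n)])
    return basis


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dequantize(levels, step):
    """Scale the quantized levels back to approximate coefficient values."""
    return [[level * step for level in row] for row in levels]


def inverse_transform(coeffs):
    """Rebuild the prediction error from its coefficients: C^T * coeffs * C."""
    c = dct_basis(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def reconstruct(levels, step, predicted):
    """Combine the predicted block with the reconstructed prediction error."""
    error = inverse_transform(dequantize(levels, step))
    return [[int(round(p + e)) for p, e in zip(prow, erow)]
            for prow, erow in zip(predicted, error)]


# Hypothetical input: only the lowest-frequency coefficient is non-zero.
levels = [[4, 0, 0, 0]] + [[0] * 4 for _ in range(3)]
predicted = [[100] * 4 for _ in range(4)]

reconstructed = reconstruct(levels, 2.0, predicted)
```

Any difference between `reconstructed` and the block originally presented to the encoder would be the reconstruction error; in this sketch it arises only from the rounding performed by the quantizer.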
An example of a spatial predictor 102 in FIG. 1 is herein described with reference to section 8.3.1 of ITU-T Recommendation H.264, published by ITU-Telecommunication Standardization Sector in March 2005 (hereinafter “H.264-2005”). In H.264-2005, a coder offers 9 prediction modes, labeled 0 through 8, for prediction of 4×4 blocks: DC prediction (Mode 2) and 8 directional modes, as shown in FIG. 2. Each prediction mode specifies a set of neighboring pixels for predicting each pixel, as illustrated in FIG. 3. In FIG. 3, the pixels a to p are to be encoded, and neighboring pixels A to L and X are used for predicting the pixels a to p. If, for example, Mode 0 is selected, then pixels a, e, i and m are predicted by setting them equal to pixel A, and pixels b, f, j and n are predicted by setting them equal to pixel B, etc. Similarly, if Mode 1 is selected, pixels a, b, c and d are predicted by setting them equal to pixel I, and pixels e, f, g and h are predicted by setting them equal to pixel J, etc. Thus, Mode 0 is a predictor in the vertical direction, and Mode 1 is a predictor in the horizontal direction.
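The vertical and horizontal modes described above, together with DC prediction, can be sketched as follows. The function name `predict_4x4` and its argument layout are assumptions made for illustration. The DC rule shown, the rounded mean of the four pixels above and the four pixels to the left, applies when both sets of neighbors are available; the handling of unavailable neighbors and the remaining directional modes are omitted from this sketch.

```python
def predict_4x4(mode, above, left):
    """Form a 4x4 predicted block from neighboring pixels.

    above = [A, B, C, D]: pixels directly above the block.
    left  = [I, J, K, L]: pixels directly to the left of the block.
    The result is a list of rows: [a..d], [e..h], [i..l], [m..p].
    """
    if mode == 0:
        # Vertical: each column repeats the pixel above it.
        return [list(above) for _ in range(4)]
    if mode == 1:
        # Horizontal: each row repeats the pixel to its left.
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:
        # DC: rounded mean of the eight neighbors (both sets available).
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise NotImplementedError("only Modes 0, 1 and 2 are sketched here")


# Hypothetical neighbor values.
above = [10, 20, 30, 40]   # A, B, C, D
left = [1, 2, 3, 4]        # I, J, K, L

vertical = predict_4x4(0, above, left)
horizontal = predict_4x4(1, above, left)
dc_block = predict_4x4(2, above, left)
```

With these inputs, the first column of `vertical` (pixels a, e, i and m) equals pixel A, and the first row of `horizontal` (pixels a, b, c and d) equals pixel I, matching the description of Modes 0 and 1.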
It has been noted that oftentimes a 16×16 macroblock contains 4×4 blocks all encoded using the same prediction mode. It would be desirable to provide an efficient way to signal to a decoder that all blocks in a macroblock are encoded using the same prediction mode.