Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless communication devices such as radio telephone handsets, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, video gaming devices, video game consoles, and the like. Digital video devices implement video compression techniques, such as MPEG-2, MPEG-4, or H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), to transmit and receive digital video more efficiently. Video compression techniques perform spatial and temporal prediction to reduce or remove redundancy inherent in video sequences.
Block-based video compression techniques may perform spatial prediction and/or temporal prediction. Intra-coding relies on spatial prediction to reduce or remove spatial redundancy between video blocks within a given coded unit, which may comprise a video frame, a slice of a video frame, or the like. In contrast, inter-coding relies on temporal prediction to reduce or remove temporal redundancy between video blocks of successive coded units of a video sequence. For intra-coding, a video encoder performs spatial prediction to compress data based on other data within the same coded unit. For inter-coding, the video encoder performs motion estimation and motion compensation to encode video information based on the movement of corresponding video blocks of two or more adjacent coded units.
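The motion estimation step described above can be illustrated with a toy full-search block matcher. This is a minimal sketch, not the search strategy of any particular codec: frames are plain nested lists, the match criterion is the sum of absolute differences (SAD), and the function names (`sad`, `motion_search`) are hypothetical.

```python
# Toy full-search block-matching motion estimation (illustrative only).
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between the n-by-n block at (cx, cy)
    in the current frame and a candidate n-by-n block at (rx, ry) in
    the reference frame."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))

def motion_search(ref, cur, cx, cy, n, search):
    """Exhaustively test every displacement (dx, dy) within +/-search
    pixels and return the motion vector with the lowest SAD."""
    h, w = len(ref), len(ref[0])
    best = (float("inf"), (0, 0))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                best = min(best, (sad(ref, cur, rx, ry, cx, cy, n),
                                  (dx, dy)))
    return best[1]
```

A real encoder typically restricts or prunes this search (e.g., diamond or hierarchical search), since exhaustive matching over large windows is computationally expensive.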
Video blocks may include luminance (luma) blocks and chrominance (chroma) blocks. Video coding commonly uses the YCbCr color space, in which Y represents the luma component and Cb and Cr represent two different chroma components of a block of pixels. Given a 16-by-16 block of pixels, four 8-by-8 Y blocks, one sub-sampled 8-by-8 Cb block, and one sub-sampled 8-by-8 Cr block may be used to represent the 16-by-16 block of pixels, and block-based coding may occur for each of these video blocks. The term “macroblock” is sometimes used to refer to such a set of four 8-by-8 Y blocks, one sub-sampled 8-by-8 Cb block, and one sub-sampled 8-by-8 Cr block that collectively define a 16-by-16 block of pixels. In some formats, macroblocks can be partitioned into other luma and chroma block sizes, and may define even finer block partitions such as 2-by-2 blocks, 2-by-4 blocks, 4-by-2 blocks, 4-by-4 blocks, 4-by-8 blocks, 8-by-4 blocks, and so forth.
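The macroblock layout described above can be sketched in a few lines. This is an illustrative decomposition under assumed 4:2:0 sampling with 2-by-2 averaging, not the normative procedure of any standard; the helper names (`split_luma`, `subsample_420`) are hypothetical.

```python
# Splitting a 16-by-16 macroblock into four 8-by-8 luma blocks and
# producing 8-by-8 sub-sampled chroma blocks (illustrative only).
def split_luma(y_plane):
    """Partition a 16-by-16 luma plane into four 8-by-8 blocks in
    raster order: top-left, top-right, bottom-left, bottom-right."""
    blocks = []
    for by in (0, 8):
        for bx in (0, 8):
            blocks.append([row[bx:bx + 8] for row in y_plane[by:by + 8]])
    return blocks

def subsample_420(chroma_plane):
    """Average each 2-by-2 neighborhood of a 16-by-16 chroma plane,
    halving both dimensions to yield one 8-by-8 chroma block."""
    h, w = len(chroma_plane), len(chroma_plane[0])
    return [[(chroma_plane[2 * y][2 * x] + chroma_plane[2 * y][2 * x + 1] +
              chroma_plane[2 * y + 1][2 * x] + chroma_plane[2 * y + 1][2 * x + 1]) // 4
             for x in range(w // 2)] for y in range(h // 2)]
```

Applying `split_luma` to the Y plane and `subsample_420` to each of the Cb and Cr planes yields the six blocks (four Y, one Cb, one Cr) that collectively represent the 16-by-16 pixel region.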
A coded video block may be represented by prediction information that can be used to create or identify a predictive block, and a residual block of data indicative of differences between the block being coded and the predictive block. In the case of inter-coding, one or more motion vectors are used to identify the predictive block of data (typically from a previous or subsequent video frame of a video sequence), while in the case of intra-coding, the prediction mode may define how the predictive block is generated based on data within the same frame or other coded unit. Both intra-coding and inter-coding may define several different prediction modes, which may define different block sizes and/or prediction techniques used in the coding. Additional types of syntax elements may also be included as part of encoded video data in order to control or define the coding techniques or parameters used in the coding process.
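The residual relationship described above is simply a per-pixel difference and sum. The following minimal sketch (with hypothetical function names) shows the symmetry: the encoder transmits the residual, and the decoder adds it back to the same predictive block to reconstruct the coded block.

```python
# Residual formation at the encoder and reconstruction at the decoder
# (illustrative only; real codecs operate on transformed, quantized
# residuals rather than raw differences).
def residual(cur_block, pred_block):
    """Encoder side: residual = current block - predictive block."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(cur_block, pred_block)]

def reconstruct(pred_block, res_block):
    """Decoder side: reconstructed block = predictive block + residual."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred_block, res_block)]
```

Because prediction from motion vectors or intra modes is usually close to the actual block, the residual values cluster near zero, which is what makes the subsequent transform and quantization stages effective.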
After block-based prediction, the video encoder may apply transform, quantization, and entropy coding processes to further reduce the bit rate associated with communication of a residual block. Transform techniques may comprise discrete cosine transforms or conceptually similar processes, wavelet transforms, integer transforms, or other types of transforms. In a discrete cosine transform (DCT) process, as an example, the transform process converts a set of pixel values into transform coefficients, which may represent the energy of the pixel values in the frequency domain. Quantization is applied to the transform coefficients, and generally involves a process that limits the number of bits associated with any given transform coefficient. Entropy coding comprises one or more processes that collectively compress a sequence of quantized transform coefficients.
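The transform and quantization steps above can be sketched with a textbook 8-by-8 floating-point DCT-II followed by uniform scalar quantization. This is a conceptual illustration only: standards such as H.264/AVC instead specify integer transforms and non-uniform quantization scaling, and the function names here are hypothetical.

```python
import math

N = 8  # block dimension for the 8-by-8 transform

def dct2(block):
    """Forward 2-D DCT-II of an N-by-N block: converts pixel values
    into frequency-domain transform coefficients."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [[c(u) * c(v) * sum(
                block[y][x]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]

def quantize(coeffs, step):
    """Uniform scalar quantization: dividing by a step size and
    rounding limits the number of bits per coefficient."""
    return [[int(round(coef / step)) for coef in row] for row in coeffs]
```

For a flat (constant) block, all the energy collapses into the single DC coefficient and the quantized AC coefficients are zero, producing the long zero runs that entropy coding then compresses efficiently.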