Data compression is widely used in a variety of applications to reduce consumption of data storage space, transmission bandwidth, or both. Example applications of data compression include visible or audible media data coding, such as digital video, image, speech, and audio coding. Digital video coding, for example, is used in a wide range of devices, including digital televisions, digital direct broadcast systems, wireless communication devices, personal digital assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, video gaming devices, cellular or satellite radio telephones, or the like. Digital video devices implement video compression techniques, such as MPEG-2, MPEG-4, or H.264/MPEG-4 Advanced Video Coding (AVC), to transmit and receive digital video more efficiently.
In general, video compression techniques perform spatial prediction, motion estimation, and motion compensation to reduce or remove redundancy inherent in video data. In particular, intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video within a given video frame. Inter-coding relies on temporal prediction to reduce or remove temporal redundancy in video across adjacent frames. For inter-coding, a video encoder performs motion estimation to track the movement of matching video blocks between two or more adjacent frames. Motion estimation generates motion vectors, which indicate the displacement of video blocks relative to corresponding video blocks in one or more reference frames. Motion compensation uses a motion vector to generate a prediction video block from a reference frame. After motion compensation, a residual video block is formed by subtracting the prediction video block from the original video block.
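By way of illustration only, the inter-coding steps described above (motion estimation, motion compensation, and residual formation) may be sketched as follows. The sketch assumes an exhaustive block-matching search that minimizes the sum of absolute differences (SAD), which is one common choice and not necessarily the search used by any particular encoder. The function names, the 4×4 block size, the ±2 pixel search range, and the synthetic frames are all hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )


def estimate_motion(cur, ref, bx, by, n=4, search=2):
    """Exhaustive search: return the (dx, dy) offset into the reference
    frame that minimizes SAD for the block at (bx, by)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def motion_compensate(ref, bx, by, mv, n=4):
    """Fetch the prediction block from the reference frame using mv."""
    dx, dy = mv
    return [[ref[by + dy + r][bx + dx + c] for c in range(n)] for r in range(n)]


def residual_block(cur, pred, bx, by, n=4):
    """Residual = original block minus prediction block."""
    return [[cur[by + r][bx + c] - pred[r][c] for c in range(n)] for r in range(n)]


# Synthetic 8x8 reference frame (a simple gradient, assumed for the demo).
ref = [[10 * y + x for x in range(8)] for y in range(8)]
# Current frame: the same content moved one pixel right and one pixel down.
cur = [[ref[y - 1][x - 1] if x >= 1 and y >= 1 else 0 for x in range(8)]
       for y in range(8)]

mv = estimate_motion(cur, ref, bx=2, by=2)
pred = motion_compensate(ref, 2, 2, mv)
residual = residual_block(cur, pred, 2, 2)
```

In this synthetic case the search recovers the one-pixel shift exactly, so the residual block is all zeros. With real video the match is rarely perfect, and the nonzero residual is what proceeds to the transform stage.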
A video encoder then applies a transform followed by quantization and lossless statistical coding processes to further reduce the bit rate of the residual block produced by the video coding process. In some instances, the applied transform comprises a discrete cosine transform (DCT) applied in the horizontal and vertical directions separately. Typically, the DCT is applied to video blocks whose size is a power of two, such as a video block that is 4 pixels high by 4 pixels wide (which is often referred to as a “4×4 video block”). Often, the DCT is a one-dimensional (1D) or linear DCT, which is applied first to the rows of the video block and then to the columns of the video block. These 1D DCTs may therefore be referred to as 4-point DCTs, in that each operates on the four samples of a row or column of a 4×4 video block, and together they produce a 4×4 matrix of DCT coefficients. The 4×4 matrix of DCT coefficients produced by applying the 4-point DCT to the residual block then undergoes quantization and lossless statistical coding processes (commonly known as “entropy coding” processes) to generate a bitstream. Examples of statistical coding processes include context-adaptive variable length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC). A video decoder receives the encoded bitstream and performs lossless decoding to decompress residual information for each of the blocks. Using the residual information and motion information, the video decoder reconstructs the encoded video.
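As a non-limiting illustration, the row-then-column application of a 4-point DCT described above, followed by quantization, may be sketched as follows. The sketch assumes an orthonormal floating-point DCT-II and a single uniform quantization step. Practical codecs such as H.264/AVC use integer approximations of the transform and more elaborate quantization, so the function names and the step size here are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal N-point DCT-II of the sequence x (N = 4 for a 4x4 block)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def dct_2d(block):
    """Separable 2D DCT: 1D DCT on each row, then on each column."""
    rows = [dct_1d(row) for row in block]
    # zip(*rows) iterates over columns; transform each, then transpose back.
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def quantize(coeffs, step):
    """Uniform scalar quantization (assumed; real codecs vary the step)."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat 4x4 residual block: all energy should land in the DC coefficient.
block = [[8, 8, 8, 8] for _ in range(4)]
coeffs = dct_2d(block)
levels = quantize(coeffs, step=8)
```

For the flat block, the only significant coefficient is the DC term at position (0, 0), which equals 32 up to floating-point rounding, and the quantized levels contain a single nonzero entry. This concentration of energy into few coefficients is what makes the subsequent entropy coding effective.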