1. Field of the Invention
The present invention relates to video processing technology. In one aspect, the present invention relates to decompression of digital video information.
2. Description of the Related Art
Because video information requires a large amount of storage space, video information is generally compressed. Accordingly, to display compressed video information which is stored, for example on a CD-ROM or DVD, the compressed video information must be decompressed to provide decompressed video information. The decompressed video information is then provided in a bit stream to a display. The decompressed bit stream of video information is typically stored as a bit map in memory locations corresponding to pixel locations on a display. The video information required to present a single screen of information on a display is called a frame. A goal of many video systems is to quickly and efficiently decode compressed video information so as to provide motion video by displaying a sequence of frames.
Standardization of recording media, devices and various aspects of data handling, such as video compression, is highly desirable for continued growth of this technology and its applications. A number of (de)compression standards have been developed or are under development for compressing and decompressing video information, such as the Moving Picture Experts Group (MPEG) standards for video encoding and decoding (e.g., MPEG-1, MPEG-2, MPEG-3, MPEG-4, MPEG-7, MPEG-21) or the Windows Media Video compression standards (e.g., WMV9). Each of the MPEG and WMV standards is hereby incorporated by reference in its entirety as if fully set forth herein.
In general, video compression techniques include intraframe compression and interframe compression, which compress video information by reducing the spatial and temporal redundancy that is present in video frames. Intraframe compression techniques use only information contained within the frame to compress the frame, which is called an I-frame. Interframe compression techniques compress frames with reference to preceding and/or following frames; frames compressed in this way are typically called predicted frames, P-frames, or B-frames. Intraframe and interframe compression techniques usually use a spatial or block-based encoding whereby a video frame is split into blocks for encoding (also referred to as a block transformation process). For example, an I-frame is split into 8×8 blocks. The blocks are coded using a discrete cosine transform (DCT) coding scheme, which represents each coefficient as the amplitude of a specific cosine basis function, or using some other transform (e.g., an integer transform). The transformed coefficients are then quantized, which produces coefficients with non-zero amplitude levels and runs (or subsequences) of zero amplitude level coefficients. The quantized coefficients are then run-level encoded (or run length encoded) to condense the long runs of zero coefficients. The results are then entropy coded in a variable length coder (VLC), which uses a statistical coding technique that assigns codewords to the values to be encoded, or using some other entropy encoding technique, such as Context-based Adaptive Binary Arithmetic Coding (CABAC), Context Adaptive Variable Length Coding (CAVLC) and the like. Values having a high frequency of occurrence are assigned short codewords, and those having infrequent occurrence are assigned long codewords. On average, the more frequent shorter codewords dominate so that the code string is shorter than the original data. Thus, spatial or block-based encoding techniques compress the digital information associated with a single frame.
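The quantization, run-level and entropy coding steps described above can be illustrated with a minimal Python sketch. The function names, the uniform quantizer, the step size of 8, and the use of Huffman code lengths as a stand-in for a VLC table are all assumptions chosen for illustration; none of them is taken from the MPEG or WMV standards, and the input list is assumed to be already in scan order.

```python
import heapq
from collections import Counter


def quantize(coeffs, step):
    """Uniform quantizer (assumed): divide by step, truncate toward zero."""
    return [int(c / step) for c in coeffs]


def run_level_encode(levels):
    """Convert scanned levels into (zero_run, level) pairs.

    Trailing zeros are dropped; a real codec would signal them with an
    end-of-block code, which this sketch leaves out.
    """
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


def run_level_decode(pairs, length):
    """Rebuild the level list, padding the tail with zeros."""
    out = []
    for run, level in pairs:
        out.extend([0] * run)
        out.append(level)
    out.extend([0] * (length - len(out)))
    return out


def huffman_code_lengths(symbols):
    """Return {symbol: code length}; frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {symbol: 1 for symbol in freq}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


coeffs = [80, -24, 3, 2, 9, 0, -1, 0]      # hypothetical transform output
levels = quantize(coeffs, 8)               # small values collapse to zero
pairs = run_level_encode(levels)           # zero runs are folded into pairs
lengths = huffman_code_lengths([(0, 1)] * 5 + [(2, 1)] * 2 + [(0, -3)])
```

In this sketch the eight coefficients reduce to three (run, level) pairs, and the most frequent pair in the hypothetical symbol stream receives the shortest code length.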
To compress the digital information associated with a sequence of frames, video compression techniques use the P-frames and/or B-frames to exploit the fact that there is temporal correlation between successive frames. Interframe compression techniques will identify the difference between different frames and then spatially encode the difference information using DCT, quantization, run length and entropy encoding techniques, though different implementations can use different block configurations. For example, a P-frame is split into 16×16 macroblocks (e.g., with four 8×8 luminance blocks and two 8×8 chrominance blocks) and the macroblocks are compressed. Another approach is to use motion compensation techniques to approximate the motion of the whole scene or objects in the scene and/or blocks in the video frame using parameters (e.g., motion vectors) encoded in the bit-stream to approximate the pixels of the predicted frame by appropriately translating pixels of the reference frame. Regardless of whether intraframe or interframe compression techniques are used, the use of spatial or block-based encoding techniques to encode the video data means that the compressed video data has been variable length encoded and otherwise compressed using the block-based compression techniques described above.
At the receiver or playback device, the compression steps are reversed to decode the video data that has been processed with block transformations. FIG. 1 depicts a conventional system 30 for decompressing video information which includes an input stream decoding portion 35, motion decoding portion 38, adder 39, frame buffer 40, and display 41. The input stream decoding portion 35 receives a stream of compressed video information at the input buffer 31, performs variable length decoding at the VLC decoder 32, reverses the zig-zag scan and quantization at the inverse quantizer 33, reverses the DCT transformation at IDCT 34, and provides blocks of statically decompressed video information to adder 39. In the motion decoding portion 38, the motion compensation unit 37 receives motion information from the VLC decoder 32 and a copy of the previous picture data (which is stored in the previous picture store buffer 36), and provides motion-compensated pixels to adder 39. Adder 39 receives the statically decompressed video information and the motion-compensated pixels and provides decompressed pixels to frame buffer 40, which then provides the information to display 41.
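The data flow through the inverse quantizer 33, IDCT 34 and adder 39 can be sketched in Python as follows. This is an illustrative sketch only: the function names, the uniform step size, the direct (slow) orthonormal 8×8 inverse DCT, and the clipping to the 8-bit range 0 to 255 are assumptions, and variable length decoding and the inverse zig-zag scan are omitted.

```python
import math


def inverse_quantize(levels, step):
    """Scale quantized levels back to coefficient amplitudes."""
    return [[lv * step for lv in row] for row in levels]


def idct_8x8(coeffs):
    """Direct 8x8 inverse DCT with orthonormal scaling (slow but simple)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (c(u) * c(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = total / 4
    return out


def add_and_clip(residual_block, prediction):
    """Adder stage: prediction plus residual, clipped to 0..255."""
    return [[min(255, max(0, p + int(round(r))))
             for r, p in zip(rrow, prow)]
            for rrow, prow in zip(residual_block, prediction)]


# Hypothetical block with only a DC level, so every residual pixel is equal.
levels = [[0] * 8 for _ in range(8)]
levels[0][0] = 10
residual_block = idct_8x8(inverse_quantize(levels, 8))

# Hypothetical motion-compensated prediction from the previous picture.
prediction = [[100] * 8 for _ in range(8)]
prediction[0][0] = 250
pixels = add_and_clip(residual_block, prediction)
```

With a DC coefficient of 80, each residual pixel is 10, so a predicted value of 100 becomes 110 and a predicted value of 250 saturates at 255.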
Conventional approaches for handling video decompression have used a processor-based approach for executing software instructions to perform the various video decompression steps. However, the computationally intensive video decompression operations (such as the averaging and interpolation steps involved with motion compensation) require extensive processor resources, and can severely burden the system processor when implemented in a general purpose computer system. Processor-based systems that are not able to keep up with the computational demands of such a decompression burden frequently drop entire frames to resynchronize with a real time clock signal also encoded in the video stream. While a variety of factors contribute to the challenge of obtaining timely video decompression, a significant contributing factor is the overhead associated with generating and retrieving pre-decoded reference frames in connection with motion compensation processes used with video decompression techniques (such as MPEG, WMV or H.263). With conventional processor-based approaches for handling video decompression, the motion compensation portion of the decoding process requires access to the reference frame data. The reference frame data can be readily accessed when there is a memory buffer large enough to hold a frame (e.g., a VGA-size frame of 640×480 pixels, equivalent to approximately 307 kBytes at one byte per pixel).
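The frame buffer figure quoted above can be checked with a short Python calculation. The first function assumes one byte per pixel, which is what the 307 kByte figure implies. The second is a hypothetical extension that counts every sample of a frame built from 16×16 macroblocks containing four 8×8 luminance blocks and two 8×8 chrominance blocks, assuming one byte per sample; both function names are illustrative.

```python
def frame_bytes(width, height, bytes_per_pixel=1):
    """Bytes needed to hold one frame at the given pixel depth."""
    return width * height * bytes_per_pixel


def macroblock_frame_bytes(width, height, blocks_per_macroblock=6):
    """Bytes for a frame stored as 16x16 macroblocks of 8x8 blocks.

    Assumes one byte per sample and frame dimensions that are multiples
    of 16.
    """
    macroblocks = (width // 16) * (height // 16)
    return macroblocks * blocks_per_macroblock * 8 * 8


vga_bytes = frame_bytes(640, 480)
vga_with_chroma = macroblock_frame_bytes(640, 480)
```

Under these assumptions a VGA frame occupies 307,200 bytes for one byte per pixel, and 460,800 bytes when the two chrominance blocks per macroblock are included.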
On the other hand, hardware-based approaches for performing motion compensation processing require a large local memory and pose significant bus bandwidth requirements, often resulting in slower memory access speed. In particular, hardware designs typically cannot retrieve the whole previously decoded frame, but instead are implemented with a processor core that fetches only whatever is needed for the current macroblock due to limitations imposed by the on-chip memory size. The resulting bus transaction activity can slow the decoding process. In addition, with typical system-on-a-chip (SoC) bus protocols, memory access bandwidth is wasted by the protocol requirements that memory accesses align to the bus width boundary and use predetermined data transfer sizes. For example, if 9 bytes of reference row data located in memory at starting address 18 are to be accessed over a bus having a bus width of 8 bytes where only 1, 2 or 4 beats of burst transfer are allowed (meaning that 8, 16 or 32 bytes of memory access are allowed), then the memory access would use two burst beats beginning at address 16, resulting in a bandwidth waste of approximately 44% (7 of the 16 bytes transferred are not needed).
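The alignment example above can be reproduced with a small Python sketch. The function name, its parameters, and the simplification that a request must fit in a single burst are assumptions made for illustration and do not describe any particular SoC bus protocol.

```python
def aligned_burst(addr, nbytes, bus_width=8, allowed_beats=(1, 2, 4)):
    """Return (start_address, beats, bytes_transferred, waste_fraction).

    The access is aligned down to a bus-width boundary, and the smallest
    allowed burst that covers the requested bytes is chosen.
    """
    start = (addr // bus_width) * bus_width          # align down
    end = addr + nbytes                              # exclusive end address
    needed_beats = -(-(end - start) // bus_width)    # ceiling division
    for beats in sorted(allowed_beats):
        if beats >= needed_beats:
            transferred = beats * bus_width
            waste = (transferred - nbytes) / transferred
            return start, beats, transferred, waste
    raise ValueError("request does not fit in a single allowed burst")


# 9 bytes at address 18 on an 8-byte bus with 1-, 2- or 4-beat bursts.
start, beats, transferred, waste = aligned_burst(18, 9)
```

For this request the sketch selects a two-beat burst starting at address 16, transferring 16 bytes of which 7 are unused, a waste of 7/16, or 43.75%.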
Consequently, a significant need exists for reducing the processing requirements associated with decompression methods and for improving the decompression operations. Further limitations and disadvantages of conventional systems will become apparent to one of skill in the art after reviewing the remainder of the present application with reference to the drawings and detailed description which follow.