A compression standard referred to as MPEG (Moving Picture Experts Group) compression is a set of methods for compression and decompression of full motion video images which uses the interframe and intraframe compression techniques described above. MPEG compression uses both motion compensation and discrete cosine transform (DCT) processes, among others, and can yield compression ratios of more than 30:1.
The two predominant MPEG standards are referred to as MPEG-1 and MPEG-2. The MPEG-1 standard generally concerns frame data reduction using block-based motion compensation prediction (MCP), which generally uses temporal differential pulse code modulation (DPCM). The MPEG-2 standard is similar to the MPEG-1 standard, but includes extensions to cover a wider range of applications, including interlaced digital video such as high definition television (HDTV).
Interframe compression methods such as MPEG are based on the fact that, in most video sequences, the background remains relatively stable while action takes place in the foreground. The background may move, but large portions of successive frames in a video sequence are redundant. MPEG compression uses this inherent redundancy to encode or compress frames in the sequence.
An MPEG stream includes three types of pictures, referred to as the Intra (I) frame, the Predicted (P) frame, and the Bi-directional Interpolated (B) frame. The I or Intraframes contain the video data for the entire frame of video and are typically placed every 10 to 15 frames. Intraframes provide entry points into the file for random access, and are generally only moderately compressed. Predicted frames are encoded with reference to a past frame, i.e., a prior Intraframe or Predicted frame. Thus P frames only include changes relative to prior I or P frames. In general, Predicted frames receive a fairly high amount of compression and are used as references for future Predicted frames. Thus, both I and P frames are used as references for subsequent frames. Bi-directional pictures include the greatest amount of compression and require both a past and a future reference in order to be encoded. Bi-directional frames are never used as references for other frames.
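By way of illustration only, the dependency between frame types described above implies that a B frame cannot be decoded until its future reference is available. The following minimal sketch shows one way a display-order sequence could be reordered so that each I or P frame precedes the B frames that depend on it. The function name coded_order and the example frame pattern are assumptions made for this sketch and are not taken from the MPEG standards.

```python
def coded_order(display_types):
    """Reorder display-order frame types so references come first.

    display_types: sequence of "I", "P" or "B" in display order.
    Returns a list of (display_index, frame_type) in a decodable order.
    """
    out = []
    pending_b = []  # B frames waiting for their future reference
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            # I or P frame: emit it, then the B frames that needed it.
            out.append((index, frame_type))
            out.extend(pending_b)
            pending_b = []
    # Trailing B frames have no future reference within this list.
    out.extend(pending_b)
    return out


if __name__ == "__main__":
    for index, frame_type in coded_order("IBBPBBP"):
        print(index, frame_type)
```

In this sketch the P frame at display position 3 is emitted before the B frames at positions 1 and 2, since those B frames require it as their future reference.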
In general, for the frame(s) following a reference frame, i.e., P and B frames that follow a reference I or P frame, only small portions of these frames are different from the corresponding portions of the respective reference frame. Thus, for these frames, only the differences are captured, compressed and stored. The differences between these frames are typically generated using motion vector estimation logic, as discussed below.
When an MPEG encoder receives a video file, the MPEG encoder generally first creates the I frames. The MPEG encoder may compress the I frame using an intraframe compression technique. The MPEG encoder divides respective frames into a grid of 16×16 pixel squares called macroblocks in order to perform motion estimation/compensation. Thus, for a respective target picture or frame, i.e., a frame being encoded, the encoder searches for an exact, or near exact, match between the target picture macroblock and a block in a neighboring picture referred to as a search frame. For a target P frame the encoder searches in a prior I or P frame. For a target B frame, the encoder searches in a prior or subsequent I or P frame. When a match is found, the encoder transmits a vector movement code or motion vector. The vector movement code or motion vector only includes information on the difference between the search frame and the respective target picture. The blocks in target pictures that have no change relative to the block in the reference picture or I frame are ignored. Thus the amount of data that is actually stored for these frames is significantly reduced.
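The search described above can be sketched as an exhaustive block-matching search. The example below is a simplified illustration and not a definitive encoder implementation: it assumes a sum of absolute differences (SAD) as the match criterion, whole-pixel displacements, a square search window of a hypothetical radius, and frames represented as lists of rows of luminance values. The names sad and find_motion_vector are introduced for this sketch only.

```python
def sad(target, search, ty, tx, sy, sx, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(target[ty + dy][tx + dx] - search[sy + dy][sx + dx])
    return total


def find_motion_vector(target, search, ty, tx, size=16, radius=7):
    """Find the displacement (dy, dx) of the best match in the search frame.

    (ty, tx) is the top-left corner of the macroblock in the target frame.
    Returns (dy, dx, cost), where cost is the SAD of the best match.
    Assumes the block at (ty, tx) lies inside both frames, so the zero
    displacement is always a valid candidate.
    """
    height, width = len(search), len(search[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sy, sx = ty + dy, tx + dx
            # Skip candidates that fall outside the search frame.
            if sy < 0 or sx < 0 or sy + size > height or sx + size > width:
                continue
            cost = sad(target, search, ty, tx, sy, sx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]
```

A cost of zero corresponds to the exact match mentioned above; a small non-zero cost corresponds to a near exact match, in which case the remaining difference would still have to be encoded.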
After motion vectors have been generated, the encoder then encodes the changes using spatial redundancy. Thus, after finding the changes in location of the macroblocks, the MPEG algorithm further calculates and encodes the difference between corresponding macroblocks. Encoding the difference is accomplished through a mathematical process referred to as the discrete cosine transform or DCT. This process divides the macroblock into four sub-blocks, seeking out changes in color and brightness. Human perception is more sensitive to brightness changes than color changes. Thus the MPEG algorithm devotes more effort to reducing color data than brightness.
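As a simple illustration of the division described above, the following sketch splits one 16×16 macroblock of brightness values into four 8×8 sub-blocks. The function name split_macroblock and the list-of-rows representation are assumptions of this sketch; the handling of color data is not shown.

```python
def split_macroblock(macroblock, sub=8):
    """Split a square macroblock into sub x sub blocks.

    The blocks are returned in raster order: top-left, top-right,
    bottom-left, bottom-right for a 16 x 16 input with sub=8.
    """
    size = len(macroblock)
    blocks = []
    for top in range(0, size, sub):
        for left in range(0, size, sub):
            rows = macroblock[top:top + sub]
            blocks.append([row[left:left + sub] for row in rows])
    return blocks
```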
Therefore, MPEG compression is based on two types of redundancy in video sequences: spatial redundancy, which is the redundancy within an individual frame, and temporal redundancy, which is the redundancy between consecutive frames. Spatial compression is achieved by considering the frequency characteristics of a picture frame. Each frame is divided into non-overlapping blocks, and each block is transformed via the discrete cosine transform (DCT). After the transformed blocks are converted to the "DCT domain", each entry in the transformed block is quantized with respect to a set of quantization tables. The quantization step for each entry can vary, taking into account the sensitivity of the human visual system (HVS) to the frequency. Since the HVS is more sensitive to low frequencies, most of the high frequency entries are quantized to zero. In this quantization step, information is lost and errors are introduced into the reconstructed image. Run length encoding is used to transmit the quantized values. To further enhance compression, the blocks are scanned in a zig-zag ordering that scans the lower frequency entries first, and the non-zero quantized values, along with the zero run lengths, are entropy encoded.
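The spatial compression steps described above (transform, quantization, zig-zag scan and run length encoding) can be sketched for a single 8×8 block as follows. This is an illustrative sketch under stated assumptions: the step sizes returned by quant_step are hypothetical values that merely grow with frequency and are not a standard quantization table, the "EOB" end-of-block marker is a simplification, and the final entropy encoding stage is omitted.

```python
import math

N = 8  # block size assumed for this sketch


def quant_step(u, v):
    # Hypothetical step sizes that grow with frequency; not a standard table.
    return 8 + 4 * (u + v)


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an N x N block."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs):
    """Divide each coefficient by its step and round (the lossy step)."""
    return [[int(round(coeffs[u][v] / quant_step(u, v))) for v in range(N)]
            for u in range(N)]


# Zig-zag order: walk the anti-diagonals, alternating direction,
# so that lower frequency entries are visited first.
ZIGZAG = sorted(((r, c) for r in range(N) for c in range(N)),
                key=lambda p: (p[0] + p[1],
                               p[0] if (p[0] + p[1]) % 2 else -p[0]))


def run_length(levels):
    """Encode a scanned list as (zero_run, value) pairs plus an end marker."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end marker
    return pairs


def encode_block(block):
    levels = quantize(dct_2d(block))
    scanned = [levels[r][c] for r, c in ZIGZAG]
    return run_length(scanned)
```

For a block of uniform brightness, all of the energy falls in the lowest frequency entry, so the scanned block reduces to a single pair followed by the end marker. This illustrates why scanning low frequencies first tends to leave long runs of zeros at the end of the scan.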
When an MPEG decoder receives an encoded stream, the MPEG decoder reverses the above operations. Thus the MPEG decoder performs inverse scanning to remove the zig-zag ordering, inverse quantization to de-quantize the data, and the inverse DCT to convert the data from the frequency domain back to the pixel domain. The MPEG decoder also performs motion compensation using the transmitted motion vectors to recreate the temporally compressed frames.
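The inverse operations listed above can be sketched for a single 8×8 block as follows. This sketch is illustrative only. It assumes that the 64 quantized values of a block have already been recovered in zig-zag order, and it uses a hypothetical quant_step function standing in for the quantization tables; the step sizes shown are not taken from any standard.

```python
import math

N = 8  # block size assumed for this sketch


def quant_step(u, v):
    # Hypothetical step sizes; the decoder must use the encoder's values.
    return 8 + 4 * (u + v)


# Zig-zag order: anti-diagonals visited in alternating direction.
ZIGZAG = sorted(((r, c) for r in range(N) for c in range(N)),
                key=lambda p: (p[0] + p[1],
                               p[0] if (p[0] + p[1]) % 2 else -p[0]))


def inverse_scan(scanned):
    """Place 64 zig-zag ordered values back into an N x N block."""
    levels = [[0] * N for _ in range(N)]
    for value, (r, c) in zip(scanned, ZIGZAG):
        levels[r][c] = value
    return levels


def dequantize(levels):
    """Multiply each quantized level by its step size."""
    return [[levels[u][v] * quant_step(u, v) for v in range(N)]
            for u in range(N)]


def idct_2d(coeffs):
    """Inverse of the orthonormal two-dimensional DCT-II."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (scale(u) * scale(v) * coeffs[u][v]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[y][x] = total
    return out


def decode_block(scanned):
    return idct_2d(dequantize(inverse_scan(scanned)))
```

Because quantization discards information, the block produced by decode_block is in general only an approximation of the block that was originally encoded.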
When frames are received which are used as references for other frames, such as I or P frames, these frames are decoded and stored in memory. When a temporally compressed or encoded frame is received, such as a P or B frame, motion compensation is performed on the frame using the prior decoded I or P reference frames. The temporally compressed or encoded frame, referred to as a target frame, will include motion vectors which reference blocks in prior decoded I or P frames stored in the memory. The MPEG decoder examines the motion vector, determines the respective reference block in the reference frame, and accesses the reference block pointed to by the motion vector from the memory.
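The reference block access described above can be sketched as follows. This is a minimal illustration and assumes whole-pixel motion vectors, a single reference frame held in memory as a list of rows, and a decoded difference block (here called residual) that is added to the predicted block. The function name motion_compensate and its parameters are introduced for this sketch only.

```python
def motion_compensate(reference, mv, top, left, residual, size=16):
    """Reconstruct one block of a target frame.

    reference: previously decoded I or P frame (list of rows of samples).
    mv:        (dy, dx) motion vector pointing into the reference frame.
    top, left: position of the block in the target frame.
    residual:  size x size decoded difference block.
    """
    dy, dx = mv
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            # Fetch the sample that the motion vector points to.
            predicted = reference[top + dy + y][left + dx + x]
            # Add the decoded difference and clamp to the 8-bit range.
            row.append(max(0, min(255, predicted + residual[y][x])))
        block.append(row)
    return block
```

In this sketch the caller is responsible for ensuring that the motion vector keeps the referenced block inside the reference frame.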
A typical MPEG decoder includes motion compensation logic which includes local or on-chip memory. The MPEG decoder also includes an external memory which stores prior decoded reference frames. The MPEG decoder accesses the reference frames or anchor frames stored in the external memory in order to reconstruct temporally compressed frames. The MPEG decoder also typically stores the frame being reconstructed in the external memory.
An MPEG decoder system also typically includes transport logic which operates to demultiplex received data into a plurality of individual multimedia streams. An MPEG decoder system also generally includes a system controller which controls operations in the system and executes programs or applets.
Prior art MPEG video decoder systems have generally used a frame store memory for the MPEG decoder motion compensation logic which stores the reference frames or anchor frames as well as the frame being reconstructed. Prior art MPEG video decoder systems have also generally included a separate memory for the transport and system controller functions. It has generally not been possible to combine these memories, due to size limitations. For example, current memory devices are fabricated on a 4 Mbit granularity. In prior art systems, the memory requirements for the transport and system controller functions as well as the decoder motion compensation logic would exceed 16 Mbits of memory, thus requiring 20 or 24 Mbits of memory. This additional memory adds considerable cost to the system.
The amount of memory is a major cost item in the production of video decoders. Thus, it is desired to reduce the memory requirements of the decoder system as much as possible to reduce its size and cost. Since practical memory devices are implemented using particular convenient discrete sizes, it is important to stay within a particular size if possible for commercial reasons. For example, it is desired to keep the memory requirements below a particular size of memory, such as 16 Mbits, since otherwise a memory device of 20 or 24 Mbits would have to be used, resulting in greater cost and extraneous storage area. As mentioned above, it has heretofore not been possible to combine the memory required for the transport and system controller functions with the memory required for the MPEG decoder logic due to the memory size requirements.
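The effect of the memory granularity discussed above can be illustrated with a short calculation. The figures in this sketch are assumptions chosen for illustration and do not appear in the foregoing description: a 720×576 picture, 8 bits per sample, color data stored at half resolution in each direction (so that a frame holds 1.5 samples per pixel), three stored frames, and 1 Mbit taken as 2^20 bits. The function names are likewise hypothetical.

```python
import math

MBIT = 1 << 20  # assumption: 1 Mbit = 2**20 bits


def frame_bits(width, height, bits_per_sample=8):
    """Bits for one frame, assuming 1.5 samples per pixel."""
    return width * height * 3 // 2 * bits_per_sample


def rounded_capacity_mbit(total_bits, granularity_mbit=4):
    """Smallest capacity, in Mbits, that holds total_bits at the granularity."""
    units = math.ceil(total_bits / (granularity_mbit * MBIT))
    return units * granularity_mbit


if __name__ == "__main__":
    # Assumed frame store: two reference frames plus the frame in progress.
    frames = 3 * frame_bits(720, 576)
    print(frames / MBIT, rounded_capacity_mbit(frames))
    # Adding a hypothetical 3 Mbits for other functions crosses the boundary.
    print(rounded_capacity_mbit(frames + 3 * MBIT))
```

Under these assumptions the frame store alone fits within 16 Mbits, while adding even a few Mbits for other functions pushes the requirement to the next 4 Mbit step.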
Therefore, a new video decoder system and method is desired which efficiently uses memory and combines the memory subsystem for reduced memory requirements and hence reduced cost.