International standardization committees have been working on the specification of the coding methods and transmission formats for several compression algorithms to facilitate worldwide interchange of digitally encoded audiovisual data. The Joint Photographic Experts Group (JPEG) of the International Standards Organization (ISO) specified an algorithm for compression of still images. The ITU (formerly CCITT) proposed the H.261 standard for video telephony and video conferencing. The Moving Picture Experts Group (MPEG) of ISO specified a first standard, MPEG-1, which is used for interactive video and provides a picture quality comparable to VCR quality. MPEG has also specified a second standard, MPEG-2, which provides audiovisual quality of both broadcast TV and HDTV. Because of its wide field of applications, MPEG-2 is a family of standards with different profiles and levels.
The JPEG coding scheme could in principle also be used for coding image sequences, an approach sometimes described as motion JPEG. However, this intraframe coding is not very efficient because the redundancy between successive frames is not exploited. The redundancy between successive frames can be reduced by predictive coding. The simplest predictive coding is differential interframe coding, where the difference between a current pixel of the present frame and the corresponding pixel of the previous frame is quantized, coded and transmitted. To perform such interframe prediction, a frame memory for storing one or more frames is required to allow for this pixel-by-pixel comparison. Higher efficiency than simple differential interframe coding can be achieved by a combination of discrete cosine transform (DCT) and interframe prediction. In so-called hybrid coding, the interframe difference is obtained, DCT coded in a manner similar to JPEG, and then transmitted. In order to have the same prediction at both the receiver and the transmitter, the decoder is incorporated into the coder. This results in a special feedback structure at the transmitter which avoids coder-decoder divergence.
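The feedback structure described above can be illustrated with a minimal Python sketch of differential interframe coding. It is a simplified illustration, not the coder of any particular standard: frames are flat lists of pixel values, the quantizer is a plain uniform quantizer, and the names `quantize`, `encode_sequence` and `decode_sequence` are hypothetical. The point it shows is that the coder updates its frame memory from the quantized differences, exactly as the decoder does, so quantization error does not accumulate from frame to frame.

```python
def quantize(value, step):
    """Uniform quantizer: index of the multiple of `step` nearest to `value`."""
    return int(round(value / step))


def encode_sequence(frames, step):
    """Closed-loop differential interframe coder.

    frames: list of frames, each a flat list of pixel values.
    Returns one list of quantized difference indices per frame.
    """
    reference = [0] * len(frames[0])  # frame memory; decoder starts from the same state
    coded = []
    for frame in frames:
        # Code the difference against the *reconstructed* previous frame.
        indices = [quantize(p - r, step) for p, r in zip(frame, reference)]
        coded.append(indices)
        # Local decoder: update the frame memory exactly as the receiver will.
        reference = [r + i * step for r, i in zip(reference, indices)]
    return coded


def decode_sequence(coded, step):
    """Receiver side: accumulate the dequantized differences."""
    reference = [0] * len(coded[0])
    decoded = []
    for indices in coded:
        reference = [r + i * step for r, i in zip(reference, indices)]
        decoded.append(list(reference))
    return decoded
```

Because the prediction is formed from the reconstructed frame rather than the original one, the reconstruction error of every pixel in every frame stays within half a quantizer step in this sketch.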
Variable word length coding results in a variable bit rate which depends on image content, sequence changes, etc. Transmission of the coded information over a constant rate channel requires a FIFO buffer at the output to smooth the data rate. The average video rate has to be adjusted to the constant channel rate. This is performed by controlling the quantizer according to the buffer content. If the buffer is nearly full, the quantization is made more severe and thus the coded bit rate is reduced. Conversely, if the buffer is nearly empty, the quantization is relaxed.
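A toy simulation can make the buffer feedback concrete. The sketch below rests on loudly stated assumptions that are not part of the text above: the number of bits a frame produces is modeled as a "complexity" figure divided by the quantizer step, the step is a linear function of buffer fullness, and the step range of 1 to 31 is chosen only for illustration. The function name `simulate_rate_control` is hypothetical.

```python
def simulate_rate_control(frame_complexities, channel_bits_per_frame,
                          buffer_size, q_min=1, q_max=31):
    """Toy buffer-feedback rate control.

    Assumed model (illustrative only): bits for a frame = complexity // q,
    where q is the quantizer step chosen from the current buffer fullness.
    Returns a list of (q, bits_produced, buffer_fullness) per frame.
    """
    fullness = buffer_size // 2  # start with a half-full buffer
    history = []
    for complexity in frame_complexities:
        # Fuller buffer -> coarser quantization -> fewer bits produced.
        fraction = fullness / buffer_size
        q = q_min + round(fraction * (q_max - q_min))
        bits = complexity // q
        # The channel drains a constant number of bits per frame period.
        fullness = fullness + bits - channel_bits_per_frame
        # Clamp: a real system must avoid overflow and pad on underflow.
        fullness = max(0, min(buffer_size, fullness))
        history.append((q, bits, fullness))
    return history
```

Feeding the simulation a run of demanding frames followed by a run of easy frames shows the quantizer step rising while the buffer fills and falling again as it drains.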
In general, MPEG coding uses a special predictive coding strategy. The coding starts with a frame which is not differentially coded; it is called an Intra frame (I). Then prediction is performed for coding one frame out of every M frames. This allows computation of a series of predicted frames (P), while "skipping" M-1 frames between coded frames. Finally, the "skipped" frames are coded in either a forward prediction mode, backward prediction mode, or bidirectional prediction mode. These frames are called bidirectionally interpolated (B) frames. The most efficient prediction mode, in terms of bit rate, is determined by the encoder, and the selected mode is associated with the coded data. Thus the decoder can perform the necessary operations in order to reconstruct the image sequence. A main difference between MPEG-1 and MPEG-2 is that MPEG-1 has been optimized for noninterlaced (progressive) format while MPEG-2 is a generic standard for both interlaced and progressive formats. Thus, MPEG-2 includes more sophisticated prediction schemes.
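The frame pattern just described can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: only the first frame is an I frame (periodic I frames are ignored), and the function names `frame_types` and `coding_order` are hypothetical. The second function shows one consequence of bidirectional prediction: a B frame can only be coded after the later frame it refers to, so the coding order differs from the display order.

```python
def frame_types(num_frames, m):
    """Display-order frame types: frame 0 is I, every m-th frame is P, the rest B."""
    types = []
    for n in range(num_frames):
        if n == 0:
            types.append("I")
        elif n % m == 0:
            types.append("P")
        else:
            types.append("B")
    return types


def coding_order(types):
    """Reorder frame indices so each B frame follows both of its anchor frames."""
    order = []
    pending_b = []
    for n, t in enumerate(types):
        if t == "B":
            pending_b.append(n)      # wait for the next I or P frame
        else:
            order.append(n)          # code the anchor first
            order.extend(pending_b)  # then the B frames that precede it in time
            pending_b = []
    order.extend(pending_b)          # trailing B frames have no later anchor
    return order
```

For example, with M = 3 and seven frames, the display order is I B B P B B P, while the coding order of the frame indices is 0, 3, 1, 2, 6, 4, 5.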
In more detail, motion pictures are provided at thirty frames per second to create the illusion of continuous motion. Since each picture is made up of thousands of pixels, the amount of storage necessary for storing even a short motion sequence is enormous. As higher and higher definitions are desired, the number of pixels in each picture also grows. This means that the frame memory used to store each picture for interframe prediction also grows; current MPEG systems use about 16 megabits (MB) of reference memory for this function. Fortunately, lossy compression techniques have been developed to achieve very high data compression without loss of perceived picture quality by taking advantage of special properties of the human visual system. (A lossy compression technique involves discarding information that the human visual system does not need in order to perceive the target picture quality.) An MPEG decoder is then required to reconstruct, in real time or nearly real time, every pixel of the stored motion sequence; current MPEG decoders use at least about 16 MB of frame memory for reconstruction of frames using the encoded interframe prediction data.
The MPEG standard specifies both the coded digital representation of the video signal for the storage media, and the method for decoding, to achieve compatibility between compression and decompression equipment. The standard supports normal speed playback, as well as other play modes of color motion pictures, and reproduction of still pictures. The standard covers the common 525- and 625-line television, personal computer and workstation display formats. The MPEG-1 standard is intended for equipment supporting a continuous transfer rate of up to 1.5 Mbits per second, such as compact disks, digital audio tapes, or magnetic hard disks. The MPEG-2 standard supports bit rates from 4 Mbits/sec to 15 Mbits/sec and is targeted for equipment that complies with the International Radio Consultative Committee (CCIR) recommendation 601 (CCIR-601). The MPEG standard is intended to support picture frames at a rate between 24 Hz and 30 Hz. ISO-11172 entitled "Coding for Moving Pictures and Associated Audio for digital storage medium at 1.5 Mbit/s," provides the details of the MPEG-1 standard. ISO-13818 entitled "Generic Coding of Moving Pictures and Associated Audio" provides the details of the MPEG-2 standard.
Under the MPEG standard, the picture frame is divided into a series of "Macroblock slices" (MBS), each MBS containing a number of picture areas (called "macroblocks") each covering an area of 16×16 pixels. Each of these picture areas is represented by one or more 8×8 matrices whose elements are the spatial luminance and chrominance values. In one representation (4:2:2) of the macroblock, a luminance value (Y type) is provided for every pixel in the 16×16 pixel picture area (in four 8×8 "Y" matrices), and chrominance values of the U and V (i.e., blue and red chrominance) types, each covering the same 16×16 picture area, are respectively provided in two 8×8 "U" and two 8×8 "V" matrices. That is, each 8×8 U or V matrix covers an area of 8×16 pixels. In another representation (4:2:0), a luminance value is provided for every pixel in the 16×16 pixel picture area, and one 8×8 matrix for each of the U and V types is provided to represent the chrominance values of the 16×16 pixel picture area. A group of four contiguous pixels in a 2×2 configuration is called a "quad pixel"; hence, the macroblock can also be thought of as comprising 64 quad pixels in an 8×8 configuration.
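The block counts implied by the two macroblock representations can be checked with a short Python sketch. The function names are hypothetical, and the 720×480 frame size used in the usage note is an assumed example, not a figure taken from the text above.

```python
BLOCK = 8        # side of one coefficient matrix, in samples
MACROBLOCK = 16  # side of one macroblock, in pixels


def blocks_per_macroblock(chroma_format):
    """Number of 8x8 matrices of each type in one 16x16 macroblock."""
    luma = (MACROBLOCK // BLOCK) ** 2  # four Y matrices cover 16x16 pixels
    if chroma_format == "4:2:2":
        chroma_each = 2                # each U or V matrix covers 8x16 pixels
    elif chroma_format == "4:2:0":
        chroma_each = 1                # one U and one V matrix cover 16x16 pixels
    else:
        raise ValueError("unsupported chroma format: " + chroma_format)
    return {"Y": luma, "U": chroma_each, "V": chroma_each}


def quad_pixels_per_macroblock():
    """Number of 2x2 quad pixels in one macroblock."""
    return (MACROBLOCK // 2) ** 2


def macroblocks_per_frame(width, height):
    """Macroblocks needed to cover a frame, rounding partial macroblocks up."""
    across = (width + MACROBLOCK - 1) // MACROBLOCK
    down = (height + MACROBLOCK - 1) // MACROBLOCK
    return across * down
```

A 4:2:2 macroblock therefore carries eight 8×8 matrices and a 4:2:0 macroblock carries six; an assumed 720×480 frame would be covered by 45 × 30 = 1350 macroblocks.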
The MPEG standard adopts a model of compression and decompression shown in FIG. 1. As shown in FIG. 1, interframe redundancy (represented by block 101) is first removed from the color motion picture frames. To achieve interframe redundancy removal, each frame is designated either "intra," "predicted" or "interpolated" for coding purposes. Intra frames are least frequently provided, the predicted frames are provided more frequently than the intra frames, and all the remaining frames are interpolated frames. The value of every pixel in an intra frame ("I-picture") is independently provided. In a predicted frame ("P-picture"), only the incremental changes in pixel values from the last I-picture or P-picture are coded. In an interpolated frame ("B-picture"), the pixel values are coded with respect to both an earlier frame and a later frame. Again, large (16 MB) frame or reference memories are required to store frames of video to allow for this type of coding.
The MPEG standard does not require frames to be stored in strict time sequence, so that the intra frame from which a predicted frame is coded can be provided in the picture sequence either earlier or later in time than the predicted frame. By coding frames incrementally, using predicted and interpolated frames, much interframe redundancy can be eliminated, which results in tremendous savings in storage requirements. Further, motion of an entire macroblock can be coded by a motion vector, rather than at the pixel level, thereby providing further data compression.
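How a single motion vector can stand in for the movement of a whole block may be easier to see in code. The following Python sketch uses an exhaustive block-matching search with a sum-of-absolute-differences (SAD) cost. This search strategy is an assumption made for illustration; the text above does not say how an encoder finds its motion vectors, and the function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame
    at (bx, by) and the reference-frame block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, size, search):
    """Exhaustive search over displacements within +/- `search` pixels.

    Frames are lists of rows. Returns (dx, dy, cost) for the best match.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

When the picture content has simply shifted between frames, the search finds a displacement with zero cost, so the block can be described by the two numbers of the vector instead of by its individual pixel differences.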
The next steps in compression under the MPEG standard remove intraframe redundancy. In the first step, represented by block 102 of FIG. 1, a 2-dimensional discrete cosine transform (DCT) is performed on each of the 8×8 matrices of values to map the spatial luminance or chrominance values into the frequency domain.
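As an illustration of this transform step, the sketch below computes a 2-dimensional DCT of one 8×8 block directly from its definition in plain Python. It uses the orthonormal DCT-II scaling and a straightforward quadruple loop for readability; practical implementations use fast factorizations, and the function name `dct_2d` is hypothetical.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def c(k):
        # Scale factor: the DC basis function has a different normalization.
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = c(u) * c(v) * s
    return coeffs
```

A block in which every sample has the same value transforms to a single non-zero coefficient at position (0, 0), the "DC" term; all other ("AC") coefficients are zero up to rounding error. This is why smooth picture areas compress well after the transform.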
Next, represented by block 103 of FIG. 1, a process called "quantization" weights each element of the 8×8 matrix in accordance with its chrominance or luminance type and its frequency. In an I-picture, the quantization weights are intended to reduce to zero many high frequency components to which the human eye is not sensitive. In P- and B-pictures, which contain mostly higher frequency components, the weights are not related to visual perception. Having created many zero elements in the 8×8 matrix, each matrix can now be represented without information loss as an ordered list of a "DC" value, and alternating pairs of a non-zero "AC" value and a length of zero elements following the non-zero value. The list is ordered such that the elements of the matrix are presented as if the matrix is read in a zigzag manner (i.e., the elements of a matrix A are read in the order A00, A01, A10, A20, A11, A02, etc.). The representation is space efficient because zero elements are not represented individually.
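The quantization, zigzag scan and run-length representation described above can be sketched in Python as follows. Several details are assumptions made for illustration: quantization is modeled as division by a per-element weight followed by rounding, and the count of zeros that follow the DC value is stored alongside it so that the list can be decoded unambiguously. The actual weight tables and code formats are defined by the standard and are not reproduced here; all function names are hypothetical.

```python
def zigzag_order(n=8):
    """Matrix positions (row, col) in zigzag order: A00, A01, A10, A20, A11, A02, ..."""
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        # Walk anti-diagonals; alternate the direction on each diagonal.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def quantize(coeffs, weights):
    """Divide each coefficient by its weight and round to an integer level."""
    return [[int(round(c / w)) for c, w in zip(crow, wrow)]
            for crow, wrow in zip(coeffs, weights)]


def run_length_encode(levels):
    """List of (value, zeros_following) pairs; the first pair holds the DC value."""
    scan = [levels[i][j] for i, j in zigzag_order(len(levels))]
    pairs = []
    value, run = scan[0], 0
    for x in scan[1:]:
        if x == 0:
            run += 1
        else:
            pairs.append((value, run))
            value, run = x, 0
    pairs.append((value, run))
    return pairs


def run_length_decode(pairs, n=8):
    """Rebuild the n x n matrix of levels from the (value, zeros_following) pairs."""
    scan = []
    for value, run in pairs:
        scan.append(value)
        scan.extend([0] * run)
    block = [[0] * n for _ in range(n)]
    for (i, j), x in zip(zigzag_order(n), scan):
        block[i][j] = x
    return block
```

For a matrix whose only non-zero levels are 50 at A00, -3 at A01 and 2 at A20, the encoder produces three pairs, (50, 0), (-3, 1) and (2, 60), in place of 64 individual values, and the decoder recovers the matrix exactly. The information loss occurs in the rounding inside `quantize`, not in the run-length step.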
Finally, an entropy encoding scheme, represented by block 104 in FIG. 1, is used to further compress the representations of the DC block coefficients and the AC value-run length pairs using variable length codes. Under the entropy encoding scheme, the more frequently occurring symbols are represented by shorter codes. Further efficiency in storage is thereby achieved.
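The principle that frequent symbols receive shorter codes can be demonstrated with a Huffman code built from observed symbol counts. This is a sketch of the general technique only and rests on an assumption: it derives a code from the data it is given, which is not necessarily how the code tables of the MPEG standard are obtained. The function name `huffman_code` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, k, {s: ""}) for k, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]
```

For sixteen symbols with counts 8, 4, 2, 1 and 1, the resulting code lengths are 1, 2, 3, 4 and 4 bits, so the message needs 30 bits instead of the 48 bits required by a fixed 3-bit code.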
Decompression under MPEG is shown by blocks 105-108 in FIG. 1. In decompression, the processes of entropy encoding, quantization and DCT are reversed, as shown respectively in blocks 105-107. The final step, called "absolute pixel generation" (block 108), provides the actual pixels for reproduction, in accordance with the play mode (e.g., forward, reverse, slow motion), and the physical dimensions and attributes of the display used. Again, large (16 MB) frame or reference memories are required to store frames of video to allow for this type of reproduction.
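Two of the reversed steps, inverse quantization and the inverse DCT, can be sketched in plain Python. The sketch assumes that inverse quantization is a multiplication of each level by its weight and that the transform uses the orthonormal scaling; the loop-based inverse DCT is written for clarity rather than speed, and the function names are hypothetical.

```python
import math

N = 8  # block size


def dequantize(levels, weights):
    """Multiply each quantized level by its weight to approximate the coefficient."""
    return [[lv * w for lv, w in zip(lrow, wrow)]
            for lrow, wrow in zip(levels, weights)]


def idct_2d(coeffs):
    """Inverse of the orthonormal 2-D DCT-II for an N x N block of coefficients."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    pixels = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (c(u) * c(v) * coeffs[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            pixels[x][y] = s
    return pixels
```

A block whose only non-zero level is 100 in the DC position, with a weight of 8, dequantizes to a DC coefficient of 800 and reconstructs to a uniform block of value 100. Values rounded away during quantization are not recovered, which is where the loss in lossy compression arises.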
The steps involved in compression (coding) and decompression (decoding), such as those illustrated for the MPEG standard discussed above, are very computationally intensive and require large amounts of memory. For such a compression scheme to be practical and widely accepted, the decompression processor must therefore be designed to provide decompression in real time and to allow economical implementation using today's computer or integrated circuit technology.
Improvements in circuits, integrated circuit devices, computer systems of all types, and methods to address all the just-mentioned challenges, among others, are desirable, as described herein.