1. Technical Field of the Invention
The present invention relates to motion picture compression circuits for pictures such as television pictures, and more particularly to a compression circuit complying with H.261 and MPEG standards.
2. Description of Related Art
FIGS. 1A-1C schematically illustrate three methods for compressing motion pictures in accordance with H.261 and MPEG standards. According to H.261 standards, pictures may be of intra or predicted type. According to MPEG standards, the pictures can also be of bidirectional type.
Intra (“I”) pictures are not coded with reference to any other pictures. Predicted (“P”) pictures are coded with reference to a past intra or past predicted picture. Bidirectional (“B”) pictures are coded with reference to both a past picture and a following picture.
FIG. 1A illustrates the compression of an intra picture I1. Picture I1 is stored in a memory area M1 before being processed. The pictures have to be initially stored in a memory since they arrive line by line whereas they are processed square by square, the size of each square being generally 16 by 16 pixels. Thus, before starting to process picture I1, memory area M1 must be filled with at least 16 lines.
The pixels of a 16 by 16-pixel square are arranged in a so-called “macroblock”. A macroblock includes four 8 by 8-pixel luminance blocks and two or four 8 by 8-pixel chrominance blocks. The processes hereinafter described are carried out by blocks of 8 by 8 pixels.
The blocks of each macroblock of picture I1 are submitted at 10 to a discrete cosine transform (DCT) followed at 11 by a quantization (Q). A DCT transforms a matrix of pixels (a block) into a matrix whose upper left corner coefficient tends to have a relatively high value. The other coefficients rapidly decrease as the position moves downwards and to the right. Quantization involves dividing the coefficients of the transformed matrix and rounding the results to integers, such that a large number of coefficients far from the upper left corner are reduced to zero.
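The transform and quantization steps at 10 and 11 can be illustrated with a minimal sketch. The function names and the single uniform step size are assumptions made for illustration; the standards define their own quantization matrices and special handling of some coefficients.

```python
import math

N = 8  # block size, 8 by 8 pixels

def dct_2d(block):
    """2-D DCT of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Divide each coefficient by a uniform step and round (assumed rule)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

# A smooth block (gentle diagonal gradient), typical of natural pictures.
block = [[100 + x + y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)
quantized = quantize(coeffs, step=16)
nonzero = sum(1 for row in quantized for value in row if value != 0)
```

For a smooth block such as this one, the upper left coefficient dominates and only a handful of quantized coefficients remain non-zero, which is the property the subsequent scanning and coding steps rely on.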
At 12, the quantized matrices are subject to zigzag scanning (ZZ) and to run/level coding (RLC). Zigzag scanning improves the chances of obtaining consecutive series of zero coefficients, each of which is followed by a non-zero coefficient. Run/level coding mainly consists of replacing each such series from the ZZ scanning with a pair of values, one representing the number of successive zero coefficients and the other representing the first following non-zero coefficient.
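The scanning and run/level steps at 12 can be sketched as follows. The helper names are hypothetical, and the handling of trailing zeros is simplified: here they are simply dropped, whereas a real coder signals the end of the block explicitly.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n matrix in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_level_encode(matrix):
    """Return (run, level) pairs: count of zeros, then the non-zero value that follows."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(matrix)):
        value = matrix[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are dropped in this sketch

# A quantized block with three non-zero coefficients near the upper left corner.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0] = 50
quantized[1][0] = -3
quantized[0][2] = 2
pairs = run_level_encode(quantized)  # [(0, 50), (1, -3), (2, 2)]
```

The 64 coefficients of the block are thus reduced to three pairs, which are then passed to variable length coding.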
At 13, the pairs of values from the RLC are subject to variable length coding (VLC) that includes replacing the more frequent pairs with short codes and replacing the less frequent pairs with long codes, with the aid of correspondence tables defined by the H.261 and MPEG standards. The divisors used in the quantization can be varied from one block to the next by multiplying them by a quantization coefficient. That quantization coefficient is inserted during variable length coding in headers preceding the compressed data corresponding to macroblocks.
Macroblocks of an intra picture are used to compress macroblocks of a subsequent picture of predicted or bidirectional type. Consequently, a decoder reconstructs a predicted or bidirectional picture from a previously decoded intra picture. This previously decoded intra picture does not exactly correspond to the actual picture initially received by the compression circuit, since this initial picture is altered by the quantization at 11. Thus, the compression of a predicted or bidirectional picture is carried out from a reconstructed intra picture I1r rather than from the real intra picture I1, so that decoding is carried out under the same conditions as encoding.
The reconstructed intra picture I1r is stored in a memory area M2 and is obtained by subjecting the macroblocks provided by the quantization 11 to a reverse processing, that is, at 15 an inverse quantization (Q−1) followed at 16 by an inverse DCT (DCT−1).
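The reason a reconstructed picture differs from the received one can be illustrated with a minimal sketch of the reverse processing at 15 and 16. All function names and the uniform step size are assumptions; the sketch only shows that the transform alone is reversible while the quantization is not.

```python
import math

N = 8  # block size, 8 by 8 pixels

def _basis(k, n):
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))

def dct_2d(block):
    return [[sum(block[x][y] * _basis(u, x) * _basis(v, y)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct_2d(coeffs):
    return [[sum(coeffs[u][v] * _basis(u, x) * _basis(v, y)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def quantize(coeffs, step):
    return [[int(round(value / step)) for value in row] for row in coeffs]

def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]

def reconstruct(block, step):
    levels = quantize(dct_2d(block), step)     # data available to the decoder
    return idct_2d(dequantize(levels, step))   # block as the decoder rebuilds it

original = [[(37 * x + 11 * y * y) % 256 for y in range(N)] for x in range(N)]
lossless = idct_2d(dct_2d(original))           # transform alone: no loss
rebuilt = reconstruct(original, step=16)       # with quantization: altered block
max_error = max(abs(rebuilt[x][y] - original[x][y])
                for x in range(N) for y in range(N))
```

Because the encoder applies the same inverse quantization and inverse DCT as a decoder would, the block it stores is identical to the block the decoder obtains, including the quantization error.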
FIG. 1B illustrates the compression of a predicted picture P4. The predicted picture P4 is stored in a memory area M1. A previously processed intra picture I1r has been reconstructed in a memory area M2.
The processing of the macroblocks of the predicted picture P4 is carried out from so-called predictor macroblocks of the reconstructed picture I1r. Each macroblock of picture P4 (reference macroblock) is subject to motion estimation (ME) at 17 (generally, the motion estimation is carried out only with the four luminance blocks of the reference macroblocks).
This motion estimation includes searching in a window of picture I1r for a macroblock that is nearest, or most similar, to the reference macroblock. The nearest macroblock found in the window is the predictor macroblock. Its position is determined by a motion vector V provided by the motion estimation. The predictor macroblock is subtracted at 18 from the current reference macroblock. The resulting difference macroblock is subjected to the process described with relation to FIG. 1A.
Like the intra pictures, the predicted pictures serve to compress other predicted pictures and bidirectional pictures. For this purpose, the predicted picture P4 is reconstructed (P4r) in a memory area M3 by an inverse quantization at 15, inverse DCT at 16, and addition at 19 of the predictor macroblock that was subtracted at 18.
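The search for a predictor macroblock can be sketched as an exhaustive search over a window. The similarity measure used here, a sum of absolute differences, and the window size of plus or minus 7 pixels are assumptions chosen for illustration; neither is specified in the description above.

```python
import random

def sad(cur, cx, cy, ref, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def motion_estimate(current, reconstructed, bx, by, size=16, search=7):
    """Exhaustive search in a +/-search window; returns ((dx, dy), cost)."""
    height, width = len(reconstructed), len(reconstructed[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate would fall outside the picture
            cost = sad(current, bx, by, reconstructed, x, y, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best

def difference_block(current, reconstructed, bx, by, vector, size=16):
    """Subtract the predictor designated by the vector from the reference macroblock."""
    dx, dy = vector
    return [[current[by + j][bx + i] - reconstructed[by + dy + j][bx + dx + i]
             for i in range(size)] for j in range(size)]

# Synthetic test: the current picture is the reconstructed one displaced by (3, -2).
rng = random.Random(0)
W = H = 48
recon = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
current = [[recon[min(max(y - 2, 0), H - 1)][min(max(x + 3, 0), W - 1)]
            for x in range(W)] for y in range(H)]
vector, cost = motion_estimate(current, recon, bx=16, by=16)
residual = difference_block(current, recon, 16, 16, vector)
```

In this synthetic case the search recovers the displacement exactly, so the difference macroblock is entirely zero and compresses to almost nothing.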
The vector V provided by the motion estimation 17 is inserted in a header preceding the data provided by the variable length coding of the currently processed macroblock.
FIG. 1C illustrates the compression of a bidirectional picture B2. Bidirectional pictures are provided for in MPEG standards only. The processing of the bidirectional pictures differs from the processing of predicted pictures in that the motion estimation 17 consists of finding two predictor macroblocks in two pictures I1r and P4r, respectively, that were previously reconstructed in memory areas M2 and M3. Generally, pictures I1r and P4r respectively correspond to a picture preceding the bidirectional picture that is currently processed and to a picture following the bidirectional picture.
At 20, the mean value of the two obtained predictor macroblocks is calculated and is subtracted at 18 from the currently processed macroblock.
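The averaging at 20 and the subtraction at 18 can be sketched as below. The function names are hypothetical, and rounding halves upwards is an assumption made for the sketch. Small 2 by 2 blocks are used only to keep the example readable.

```python
def average_predictors(past, future):
    """Pixel-wise mean of two predictor blocks, halves rounded up (assumed rule)."""
    return [[(p + f + 1) // 2 for p, f in zip(past_row, future_row)]
            for past_row, future_row in zip(past, future)]

def subtract(reference, predictor):
    """Difference block that is passed on to the DCT."""
    return [[r - p for r, p in zip(ref_row, pred_row)]
            for ref_row, pred_row in zip(reference, predictor)]

past_predictor = [[10, 20], [30, 40]]    # taken from the preceding picture
future_predictor = [[12, 21], [30, 44]]  # taken from the following picture
reference = [[11, 22], [29, 42]]         # block of the bidirectional picture

mean = average_predictors(past_predictor, future_predictor)  # [[11, 21], [30, 42]]
difference = subtract(reference, mean)                       # [[0, 1], [-1, 0]]
```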
The bidirectional picture is not reconstructed because it is not used to compress another picture.
The motion estimation 17 provides two vectors V1 and V2 indicating the respective positions of the two predictor macroblocks in pictures I1r and P4r with respect to the reference macroblock of the bidirectional picture. Vectors V1 and V2 are inserted in a header preceding the data provided by the variable length coding of the currently processed macroblock.
In a predicted picture, an attempt is made to find a predictor macroblock for each reference macroblock. However, in some cases, using the predictor macroblock that is found may provide a smaller compression rate than that obtained by using an unmoved predictor macroblock (zero motion vector), or even smaller than the simple intra processing of the reference macroblock. Thus, depending upon these cases, the reference macroblock is submitted to either predicted processing with the vector that is found, predicted processing with a zero vector, or intra processing.
In a bidirectional picture, an attempt is made to find two predictor macroblocks for each reference macroblock. For each of the two predictor macroblocks, the process providing the best compression rate is determined, as indicated above with respect to a predicted picture. Thus, depending on the result, the reference macroblock is submitted to either bidirectional processing with the two vectors, predicted processing with only one of the vectors, or intra processing.
Thus, a predicted picture and a bidirectional picture may contain macroblocks of different types. The type of a macroblock is also data inserted in a header during variable length coding.
According to MPEG standards, the motion vectors can be defined with an accuracy of half a pixel. To fetch a predictor macroblock with a non-integer vector, first the predictor macroblock determined by the integer part of this vector is fetched, then this macroblock is submitted to so-called “half-pixel filtering”, which includes averaging the macroblock and the same macroblock shifted down and/or to the right by one pixel, depending on the integer or non-integer values of the two components of the vector.
According to H.261 standards, the predictor macroblocks may be subjected to low-pass filtering. For this purpose, information is provided with the vector, indicating whether filtering has to be carried out or not.
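The half-pixel filtering described above can be sketched as follows. Expressing the vector components in half-pixel units and rounding halves upwards are conventions assumed for the sketch, and the function name is hypothetical.

```python
def fetch_half_pel(picture, bx, by, vx2, vy2, size=16):
    """Fetch a predictor block; vx2 and vy2 are vector components in half-pixel units."""
    ix, iy = vx2 >> 1, vy2 >> 1   # integer part of each component (floor)
    hx, hy = vx2 & 1, vy2 & 1     # 1 when the component has a half-pixel fraction
    out = []
    for j in range(size):
        row = []
        for i in range(size):
            x, y = bx + ix + i, by + iy + j
            taps = [picture[y][x]]
            if hx:
                taps.append(picture[y][x + 1])      # block shifted right by one pixel
            if hy:
                taps.append(picture[y + 1][x])      # block shifted down by one pixel
            if hx and hy:
                taps.append(picture[y + 1][x + 1])  # block shifted down and right
            row.append((sum(taps) + len(taps) // 2) // len(taps))
        out.append(row)
    return out

picture = [[2 * x + 20 * y for x in range(8)] for y in range(8)]
integer_only = fetch_half_pel(picture, 1, 1, 0, 0, size=2)  # [[22, 24], [42, 44]]
half_right = fetch_half_pel(picture, 1, 1, 1, 0, size=2)    # [[23, 25], [43, 45]]
half_both = fetch_half_pel(picture, 1, 1, 1, 1, size=2)     # [[33, 35], [53, 55]]
half_left = fetch_half_pel(picture, 1, 1, -1, 0, size=2)    # [[21, 23], [41, 43]]
```

When both components are non-integer, four pixels contribute to each output value; when only one is non-integer, two pixels contribute; an integer vector returns the fetched block unchanged.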
The succession of types (intra, predicted, bidirectional) is assigned to the pictures in a predetermined way, in a so-called group of pictures (GOP). A GOP generally begins with an intra picture. It is usual, in a GOP, to have a periodic series, starting from the second picture, including several successive bidirectional pictures followed by a predicted picture, for example of the form IBBPBBPBB . . . where I is an intra picture, B a bidirectional picture, and P a predicted picture. The processing of each bidirectional picture B is carried out from macroblocks of the previous intra or predicted picture and from macroblocks of the next predicted picture.
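Since each bidirectional picture depends on the next predicted picture, that predicted picture has to be processed first. The following sketch derives a processing order from a display-order sequence of types; the function name and the labelling of pictures by type and display index are assumptions, and the reordering itself is inferred from the dependency just described rather than stated in the text.

```python
def coding_order(display_types):
    """Map a display-order type string such as 'IBBPBBP' to a processing order."""
    order, pending_b = [], []
    for index, kind in enumerate(display_types, start=1):
        label = f"{kind}{index}"
        if kind == "B":
            pending_b.append(label)   # must wait for the following I or P picture
        else:
            order.append(label)       # the I or P picture is processed first,
            order.extend(pending_b)   # then the B pictures that depend on it
            pending_b = []
    # Trailing B pictures would need a reference picture from the next GOP.
    return order + pending_b

order = coding_order("IBBPBBP")
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

This matches the examples of FIGS. 1A-1C, where picture B2 is compressed after both I1 and P4 have been reconstructed.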
The various functional blocks that are used in a typical prior art functional implementation are shown in FIG. 2. For clarity, the motion estimation engine and memory for storing macroblocks and video pictures have been omitted.
In FIG. 2, a reference macroblock is supplied to a subtraction circuit, where the predictor for that macroblock is subtracted (for B and P pictures only). The resultant error block (or the original macroblock, for I pictures) is passed on to a DCT block and then to a quantization block.
The quantized macroblock is forwarded to an encoding process and to an inverse quantization block. The encoding process zigzag scans the quantized macroblock, performs run/level coding on the resultant data, then variable length codes the result, outputting the encoded bitstream.
The bitstream is monitored and can be controlled via feedback to a rate control system. This system controls quantization (and dequantization) to meet certain objectives for the bitstream. A typical objective is a maximum bit-rate, although other factors can also be used.
The inverse quantization block in FIG. 2 is the start of a reconstruction chain that generates a reconstructed version of each frame, so that the frames in which the motion estimation engine searches for matching macroblocks are the same as those that will be regenerated during decoding. After inverse quantization, the macroblock is inverse DCT transformed in an IDCT block and added to the original predictor used to generate the error macroblock. This reconstructed block is stored in memory for subsequent use in the motion estimation process.
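One possible form of such a feedback loop is sketched below. The update rule, the function name, and the bit counts are purely hypothetical; the step limits of 1 and 31 are likewise assumptions of the sketch. It is meant only to show how monitoring the output can steer the quantization step.

```python
def update_quantizer(step, bits_produced, bits_target, min_step=1, max_step=31):
    """Coarsen quantization when over budget, refine it when under (hypothetical rule)."""
    if bits_produced > bits_target:
        step += 1   # larger step: more zero coefficients, fewer bits
    elif bits_produced < bits_target:
        step -= 1   # smaller step: better quality, more bits
    return max(min_step, min(max_step, step))

step = 8
history = []
for bits in [1200, 1100, 900, 1000]:   # bits produced by successive units
    step = update_quantizer(step, bits, bits_target=1000)
    history.append(step)
# history == [9, 10, 9, 9]
```

Practical rate control schemes are considerably more elaborate, but they act on the same quantity.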
The various blocks required to generate the encoded output stream have different computational requirements, which themselves can vary according to the particular application or user selected restrictions. Throttling of the output bitstream to meet bandwidth requirements is typically handled by manipulating the quantization step.
Pure hardware architectures, while potentially the most efficient, suffer from a lack of flexibility since they can support only a restricted range of standards; moreover, they have long design and verification cycles. On the other hand, pure software solutions, while being the most flexible, require high-performance processors unsuited to low-cost consumer applications.
It would be desirable to provide an architecture that allowed for relatively flexible bitstream control while reducing the amount of software-based processing power required.