1. Field of the Invention
This invention relates to a method and apparatus for picture compression which may be used with advantage for compressing a picture.
2. Description of the Related Art
FIG. 1 shows a conventional arrangement of a picture compression apparatus employed with advantage for encoding a picture for compression.
In the picture compression apparatus shown in FIG. 1, picture data digitized with the following numbers of pixels are supplied to an input terminal 1, as shown in FIG. 2:
luminance components (Y): 352(H)×240(V)×30 frames
chroma components (Cb): 176(H)×120(V)×30 frames
chroma components (Cr): 176(H)×120(V)×30 frames
The input picture data, supplied to the input terminal 1, are sent to a motion detector 20 and a block divider 11 via a frame memory 10 designed to store the input picture data transiently and to interchange the data sequence appropriately.
The block divider 11 divides each frame supplied from the frame memory 10 into blocks of 8×8 pixels for luminance components Y and chroma components Cr, Cb, as shown in FIG. 3. The four blocks Y0, Y1, Y2 and Y3 of luminance components Y, one block Cb of chroma components and one block Cr of chroma components, totalling six blocks, are termed a macro-block MB.
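The block division described above may be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the placement of Y0 to Y3 within the macro-block (upper left, upper right, lower left, lower right) is assumed, since FIG. 3 is not reproduced here.

```python
# Sketch: split one frame into macro-blocks of six 8x8 blocks each.
# All names are illustrative and are not taken from the source.

def block8(plane, top, left):
    """Return the 8x8 block of `plane` whose top-left pixel is (top, left)."""
    return [row[left:left + 8] for row in plane[top:top + 8]]

def macro_blocks(y, cb, cr):
    """Yield (Y0, Y1, Y2, Y3, Cb, Cr) for each 16x16 luminance area."""
    height, width = len(y), len(y[0])
    for my in range(height // 16):
        for mx in range(width // 16):
            ty, lx = 16 * my, 16 * mx
            yield (
                block8(y, ty, lx),           # Y0: assumed upper left
                block8(y, ty, lx + 8),       # Y1: assumed upper right
                block8(y, ty + 8, lx),       # Y2: assumed lower left
                block8(y, ty + 8, lx + 8),   # Y3: assumed lower right
                block8(cb, 8 * my, 8 * mx),  # Cb block covering the same area
                block8(cr, 8 * my, 8 * mx),  # Cr block covering the same area
            )

# Frame sizes from the text: Y is 352x240, Cb and Cr are 176x120.
y_plane = [[0] * 352 for _ in range(240)]
cb_plane = [[0] * 176 for _ in range(120)]
cr_plane = [[0] * 176 for _ in range(120)]
mbs = list(macro_blocks(y_plane, cb_plane, cr_plane))
print(len(mbs))  # 22 macro-blocks per row times 15 rows
```

With the picture sizes given above, each frame yields 22×15 = 330 macro-blocks.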
The macro-block based data from the block divider 11 is sent to a subtractor 12.
The subtractor 12 finds a difference between data from the block divider 11 and inter-frame prediction picture data as later explained and transmits a difference output to a fixed contact b of a changeover switch 13 as frame data to be encoded by inter-frame predictive coding as later explained. To the opposite side fixed contact a of the changeover switch 13 are supplied data from the block divider 11 as frame data to be intra-coded as later explained.
The block-based data from the changeover switch 13 are discrete-cosine-transformed by a DCT circuit 14 to produce DCT coefficients which are then supplied to a quantizer 15. The quantizer 15 quantizes the DCT output at a pre-set quantization step width to produce quantized DCT coefficients (quantized coefficients) which are then supplied to a zigzag scan circuit 16.
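The transform and quantization steps may be illustrated by the following sketch. It assumes an orthonormal two-dimensional DCT and a single uniform quantization step for all coefficients; an actual encoder may weight individual coefficients differently, and the function names are hypothetical.

```python
import math

N = 8  # block size used in the text

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Divide every coefficient by the step width and round (assumed uniform)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

flat_block = [[100] * N for _ in range(N)]  # a block with no detail
q = quantize(dct_2d(flat_block), step=16)
print(q[0][0])  # only the DC coefficient is non-zero for a flat block
```

For a flat block all the energy falls in the DC coefficient, so every other quantized coefficient is zero.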
The zigzag scan circuit 16 re-arrays the quantized coefficients by zig-zag scan as shown in FIG. 4. The resulting output is supplied to a variable length encoding circuit 17. The variable length encoding (VLC) circuit 17 variable length encodes the output data of the zigzag scan circuit 16 and sends the VLC output to an output buffer 18. The variable length encoding circuit 17 also sends information specifying the generated code amount to a quantization step controller 19. The quantization step controller 19 controls the quantization step width of the quantizer 15 based upon the information specifying the code amount from the variable length encoding circuit 17. Output data of the output buffer 18 is outputted at an output terminal 2 as compressed encoded data.
An output of the quantizer 15 is dequantized by a dequantizer 27 and further inverse discrete-cosine-transformed by an inverse DCT circuit 26. An output of the inverse DCT circuit 26 is supplied to an adder 25.
To the adder 25 is supplied inter-frame prediction picture data from a motion compensator 21 via a changeover switch 24 turned on for a frame for inter-frame predictive coding. The inter-frame prediction picture data is added by the adder 25 to output data of the inverse DCT circuit 26. Output data of the adder 25 is transiently stored in a frame memory 22 and thence supplied to the motion compensator 21.
The motion compensator 21 performs motion compensation based upon the motion vector detected by the motion detector 20 and outputs the resulting inter-frame prediction picture data.
An illustrative operation of the conventional picture compression apparatus shown in FIG. 1 is explained in detail. For explanation sake, the following appellations of the respective frames are used.
First, the frames arrayed in the display sequence are termed I0, B1, B2, P3, B4, B5, P6, B7, B8, I9, B10, B11, B12, . . . Of these frames, I, P and B specify the methods of compression, as later explained, and the numerical figures affixed to I, P and B simply specify the display sequence.
MPEG1, a standard of the Moving Picture Experts Group (MPEG), a working group for international standardization of color moving picture encoding systems, provides the following for compressing the above pictures.
First, the picture I0 is compressed.
Next, in compressing the picture P3, difference data between P3 and I0 is compressed in place of the picture P3 itself.
Next, in compressing the picture B1, difference data between B1 and I0, difference data between B1 and P3 or difference data between B1 and a mean value of I0 and P3, whichever is smallest in information amount, is compressed in place of the picture B1 itself.
Next, in compressing the picture B2, difference data between B2 and I0, difference data between B2 and P3 or difference data between B2 and a mean value of I0 and P3, whichever is smallest in information amount, is compressed in place of the picture B2 itself.
Next, in compressing the picture P6, difference data between P6 and P3 is compressed in place of the picture P6 itself.
The following is the representation of the above processing
Thus the encoding sequence is partially interchanged from the display sequence and becomes:
I0, P3, B1, B2, P6, B4, B5, P9, B7, B8, I9, P12, B10, B11, . . . The compressed data, that is the encoded data, is arrayed in this sequence.
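The interchange from the display sequence to the encoding sequence may be sketched as follows. The function name is hypothetical, and the sketch assumes that each B-picture is simply held back until the next I- or P-picture has been emitted.

```python
def coding_order(display):
    """Reorder picture names from display order into encoding order."""
    out, pending_b = [], []
    for name in display:
        if name[0] == "B":
            pending_b.append(name)  # held until the next reference picture
        else:
            out.append(name)        # the I- or P-picture is emitted first
            out.extend(pending_b)   # followed by the B-pictures held back
            pending_b = []
    return out + pending_b          # B-pictures with no later reference stay last

print(coding_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The B-pictures follow the reference picture they depend on, because both references must already be available when a B-picture is encoded or decoded.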
The reason the pictures are arrayed in this manner is now explained in connection with the operation of the arrangement shown in FIG. 1.
In encoding the first picture I0, data of the first picture to be encoded, supplied from the frame memory 10, is blocked by the block divider 11. The block divider 11 outputs block-based data in the sequence of Y0, Y1, Y2, Y3, Cb and Cr and transmits the data to the DCT circuit 14 via the changeover switch 13, the movable contact of which is set to the fixed contact a. The DCT circuit 14 executes two-dimensional DCT on the respective blocks for transforming the data from the spatial domain to the frequency domain.
The DCT coefficients from the DCT circuit 14 are routed to the quantizer 15 so as to be quantized at a pre-set quantization step width. The quantized coefficients are then re-arrayed in a zig-zag order by the zigzag scan circuit 16, as shown in FIG. 4. If the quantized coefficients are re-arrayed in the zig-zag order, the coefficients are arrayed in the order of increasing frequency so that the values of the coefficients become smaller in a direction proceeding towards the trailing end of the coefficient array. Therefore, if the coefficients are quantized with a given value S, the results of quantization tend to become zero towards the trailing end so that high-frequency components are cut off.
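The re-arraying in the zig-zag order may be sketched as follows. The scan pattern is assumed to be the conventional one that starts at the upper left coefficient and alternates direction on each anti-diagonal, since FIG. 4 is not reproduced here; the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Cells are grouped by anti-diagonal (row + col); the direction of
    # travel alternates from one diagonal to the next.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def zigzag_scan(block):
    """Read a square block into a one-dimensional list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

def trailing_zeros(seq):
    """Count the zeros at the trailing end of the scanned coefficients."""
    count = 0
    for value in reversed(seq):
        if value != 0:
            break
        count += 1
    return count

# A quantized block in which only three low-frequency coefficients survive.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0], quantized[0][1], quantized[1][0] = 50, 3, -2
scanned = zigzag_scan(quantized)
print(scanned[:4], trailing_zeros(scanned))  # [50, 3, -2, 0] 61
```

The long run of zeros at the trailing end is what allows the subsequent variable length encoding to represent the block compactly.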
The quantized components are then sent to the variable length encoding circuit 17 where they are encoded by Huffman coding. The resulting compressed bitstream is transiently stored in the output buffer 18 from which it is transmitted at a constant bit rate. The output buffer 18 is a buffer memory for outputting an irregularly produced bitstream at a constant bit rate.
The above-described encoding for the picture by itself is termed intra-frame coding. The encoded picture is termed an I-picture.
If a decoder receives the bitstream for the I-picture, the above procedure is followed in the reverse order to complete the first picture.
The encoding for the second picture P3 is as follows:
The second and the following pictures may be encoded into bitstreams as I-pictures. However, for raising the compression ratio, the following method is used by exploiting the fact that the contents of contiguous pictures exhibit correlation.
First, the motion detector 20 finds out, for each macro-block of the second picture, a pattern in the first picture I0 having similarity to the macro-block under consideration, and represents the pattern by a coordinate of the relative position termed a motion vector (x,y).
In the second picture, the block is not directly transmitted to the DCT circuit 14, as in the first picture described above. Instead, difference data between the block under consideration and a block referenced from the first picture depending on the motion vector for the block under consideration is found by the subtractor 12 and thence supplied to the DCT circuit 14. The method of detecting the motion vector is discussed in detail in ISO/IEC 11172-2 annex D.6.2 and hence is not elucidated herein.
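For orientation only, one simple way of finding a motion vector is an exhaustive search minimizing the sum of absolute differences (SAD). The sketch below is an assumption made for illustration and is not asserted to be the method of the cited annex; the function names, the 4×4 block size and the search range are hypothetical.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
               for i in range(size) for j in range(size))

def find_motion_vector(cur, ref, cy, cx, size, search):
    """Return ((x, y), cost) of the best match within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidates that fall outside the reference picture.
            if 0 <= ry <= height - size and 0 <= rx <= width - size:
                cost = sad(cur, ref, cy, cx, ry, rx, size)
                if best is None or cost < best[1]:
                    best = ((dx, dy), cost)
    return best

def frame_with_patch(top, left, size=12):
    """Build a black test frame containing one textured 4x4 patch."""
    frame = [[0] * size for _ in range(size)]
    for i in range(4):
        for j in range(4):
            frame[top + i][left + j] = 10 + 4 * i + j
    return frame

ref = frame_with_patch(2, 3)  # patch position in the first picture
cur = frame_with_patch(4, 4)  # same patch, moved in the second picture
mv, cost = find_motion_vector(cur, ref, 4, 4, size=4, search=3)
print(mv, cost)  # (-1, -2) 0
```

A cost of zero means the referenced block matches exactly, so the difference data supplied to the DCT circuit would be all zeros in this toy case.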
If strong correlation persists between the pattern of the first picture indicated by the motion vector and the pattern of the block being encoded, the difference data becomes small so that the volume of the compressed data becomes smaller on encoding the motion vector and the difference data than on encoding the pattern by intra-frame coding.
This compression method is termed an inter-frame predictive coding. However, a smaller amount of difference data does not necessarily lead to a reduced amount of compressed data. Thus there are occasions wherein, depending on the picture pattern, that is on the picture contents, intra-frame coding leads to a higher encoding efficiency than difference taking. In such case, the intra-frame coding is used for encoding. Which of the inter-frame coding and intra-frame coding is to be used is to be determined from macro-block to macro-block.
The above encoding procedure is explained in connection with the picture compression apparatus shown in FIG. 1. For executing inter-frame predictive coding, the encoder must hold the same picture as that produced by the decoder.
To this end, there is provided in the encoder a circuit which is the same as that provided in the decoder. This circuit is termed a local decoder. The dequantizer 27, inverse DCT circuit 26, adder 25, frame memory 22 and the motion compensator 21 make up the local decoder. The picture stored in the frame memory 22 is termed a locally decoded picture or locally decoded data.
Conversely, the pre-compression picture data is termed an original picture or original data.
Meanwhile, the first picture decoded by the local decoder is stored in the frame memory 22 during compression of the I-picture. It is to be noted that the picture produced by this local decoder is not the pre-compression picture but a picture restored after compression, that is a picture deteriorated in picture quality by compression, and is the same picture as that decoded by the decoder.
The original data of the second picture P3 enters the encoder under this condition. The motion vector must have been detected before this stage. The data has a motion vector from block to block. This vector is applied to the motion compensator 21. The motion compensator 21 outputs data of the locally decoded picture indicated by the motion vector, that is motion compensation data (MC data), for one macro-block, as the inter-frame prediction picture data.
The subtractor 12 finds a pixel-based difference between the original data of the second picture and the motion compensation data (inter-frame prediction picture data). The difference data is supplied to the DCT circuit 14. The subsequent compression method is basically the same as that for the I-picture. The picture compressed by the above-described compression method is termed a predicted picture or P-picture.
More specifically, not all macro-blocks in a P-picture are necessarily compressed by inter-frame encoding. If it is judged that intra-frame encoding is more efficient for a macro-block being encoded, the macro-block is encoded by intra-frame encoding.
That is, in the P-picture, one of the intra-frame encoding and the inter-frame encoding is selected from one macro-block to another, for compressing a given macro-block. The macro-block encoded by the intra-frame encoding or by the inter-frame encoding is termed an intra-macro-block or inter-macro-block, respectively.
In the local decoder, the output of the quantizer 15 is dequantized by the dequantizer 27 and inverse DCTed by the inverse DCT circuit 26 so as to be then summed to the motion compensation data (MC data) to provide an ultimate locally decoded picture.
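The reason for predicting from locally decoded data rather than from original data may be illustrated by a toy closed-loop coder operating on single sample values. The DCT and motion compensation are omitted for brevity, and all names are hypothetical; the sketch only shows that the encoder and the decoder reconstruct identical values.

```python
def quantize(value, step):
    return int(round(value / step))

def dequantize(level, step):
    return level * step

def encode_sequence(samples, step):
    """Encode each sample as a quantized difference from the previous reconstruction."""
    recon_prev = 0  # plays the role of the frame memory of the local decoder
    levels, recon = [], []
    for sample in samples:
        residual = sample - recon_prev              # subtractor
        level = quantize(residual, step)            # quantizer
        rec = recon_prev + dequantize(level, step)  # dequantizer and adder
        levels.append(level)
        recon.append(rec)
        recon_prev = rec                            # store the locally decoded value
    return levels, recon

def decode_sequence(levels, step):
    """Decoder side: the same dequantizer and adder as in the local decoder."""
    prev, out = 0, []
    for level in levels:
        prev = prev + dequantize(level, step)
        out.append(prev)
    return out

levels, local = encode_sequence([100, 103, 107, 104], step=4)
print(levels, local)  # [25, 1, 1, -1] [100, 104, 108, 104]
print(decode_sequence(levels, step=4) == local)  # True
```

Because the prediction is taken from the reconstructed value, the quantization error does not accumulate from one sample to the next, and the decoder output equals the locally decoded data.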
The third picture B1 is encoded as follows:
In encoding the third picture B1, the motion vector for each of the two pictures I0 and P3 is searched. The motion vector for I0 and that for P3 are termed a forward vector Mvf(x,y) and a backward vector Mvb(x,y), respectively.
For this third picture, difference data is compressed. It is crucial which data is to be compressed. In this case, too, the picture which gives the smallest information amount is selected in taking the difference. There are four alternatives possible for the compression method, that is
(1) difference with data of the picture I0 indicated by the forward vector Mvf(x,y);
(2) difference with data of the picture P3 indicated by the backward vector Mvb(x,y);
(3) difference with the mean value of the data of the picture I0 indicated by the forward vector Mvf(x,y) and the data of the picture P3 indicated by the backward vector Mvb(x,y); and
(4) no difference data is employed, that is, intra-frame encoding is used.
One of the four compression methods is selected on the macro-block basis. For the alternatives (1) to (3) of the compression method, the respective motion vectors are also supplied to the motion compensator 21, and the differences from the resulting motion compensation data are found and supplied to the DCT circuit 14. For the alternative (4), the data is directly transmitted to the DCT circuit 14.
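The selection among the four alternatives may be sketched as follows. The cost measure is an assumption made for illustration: the sum of absolute differences stands in for the information amount of alternatives (1) to (3), and the sum of absolute deviations from the block mean stands in for that of intra-frame encoding. The function names are hypothetical.

```python
def abs_sum(values):
    return sum(abs(v) for v in values)

def select_b_mode(cur, fwd, bwd):
    """Pick the alternative with the smallest assumed cost for one macro-block.

    cur: pixel values of the macro-block being encoded
    fwd: data of the earlier picture indicated by the forward vector
    bwd: data of the later picture indicated by the backward vector
    """
    mean_pred = [(f + b) / 2 for f, b in zip(fwd, bwd)]
    block_mean = sum(cur) / len(cur)
    costs = {
        "forward": abs_sum(c - f for c, f in zip(cur, fwd)),             # (1)
        "backward": abs_sum(c - b for c, b in zip(cur, bwd)),            # (2)
        "interpolated": abs_sum(c - m for c, m in zip(cur, mean_pred)),  # (3)
        "intra": abs_sum(c - block_mean for c in cur),                   # (4), assumed proxy
    }
    return min(costs, key=costs.get), costs

mode, costs = select_b_mode(cur=[10, 20, 30, 40],
                            fwd=[8, 18, 28, 38],
                            bwd=[12, 22, 32, 42])
print(mode)  # interpolated
```

In this toy case the block lies exactly midway between the two references, so the mean of the two predictions leaves no difference data at all.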
The above processing becomes possible since the two pictures I0 and P3 have been restored and present in the frame memory 22 adapted for storing the locally decoded pictures.
The fourth picture B2 is encoded as follows:
The fourth picture B2 is encoded in the same way as in the method for encoding the third picture except that the picture B1 reads B2.
The fifth picture P6 is encoded as follows:
The fifth picture P6 is encoded in the same way as in the method for encoding the second picture except that the pictures P3 and I0 read P6 and P3, respectively.
The encoding of the sixth picture and so forth is simply the repetition of the above processing and hence the corresponding description is omitted.
The MPEG also provides a group-of-pictures.
That is, a set of several pictures is termed a group-of-pictures (GOP) which must be a set of pictures contiguous to one another when seen as encoded data, that is compressed data. In addition, the GOP takes the random accessing into account and hence the picture which comes first in the GOP in the encoded data must be an I-picture, while the last picture in the GOP in the display sequence must be an I-picture or a P-picture.
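The two conditions stated above may be checked by a sketch such as the following. The picture names, in which the letter gives the picture type and the number gives the display position, follow the notation used in this description; the function name and the example GOPs are hypothetical.

```python
def gop_is_valid(coded_order):
    """Check the two GOP conditions for picture names given in encoded-data order."""
    if not coded_order:
        return False
    # The picture which comes first in the encoded data must be an I-picture.
    if coded_order[0][0] != "I":
        return False
    # The last picture in the display sequence must be an I- or P-picture.
    last_displayed = max(coded_order, key=lambda name: int(name[1:]))
    return last_displayed[0] in ("I", "P")

print(gop_is_valid(["I6", "B4", "B5", "P9", "B7", "B8"]))  # True
print(gop_is_valid(["P3", "B1", "B2"]))                    # False
```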
FIGS. 5A and 5B show an example of GOPs including a GOP made up of four pictures followed by GOPs each made up of six pictures. FIGS. 5A and 5B show the display sequence and the sequence of the encoded data, respectively.
If attention is directed to the GOP 2 in FIGS. 5A and 5B, it is seen that, since B4 and B5 are formed from P3 and I6, the pictures B4 and B5 cannot be decoded correctly when I6 is accessed by random accessing, because P3 is then not available. A GOP which cannot be correctly decoded within the GOP itself is termed an open GOP.
Conversely, provided that B4 and B5 refer only to I6, P3 is unnecessary even if I6 is accessed by random accessing, so that B4 and B5 can be decoded correctly. A GOP which can be correctly decoded within the GOP itself is termed a closed GOP.
The compression method which gives the maximum encoding efficiency is selected from among the alternatives of the compression method. The ultimate amount of the encoded data also depends upon the input picture and can be comprehended only on compressing the data.
However, it is also necessary to manage control for providing a constant bit rate of the compressed data. The parameters used for such control include a quantization step or quantization scale (Q-scale). The larger or smaller the quantization step, the smaller or larger becomes the amount of generated bits for the same compression method, respectively.
The following is the manner of controlling the value of the quantization step.
For providing a constant bit rate of the compressed data, the encoder has an output buffer 18 adapted for absorbing the picture-based difference in the amount of generated data to a limited extent.
However, if data is produced in an amount exceeding the pre-set bit rate, the residual amount in the output buffer 18 is increased until overflow eventually occurs. Conversely, if data is produced in an amount lower than the pre-set bit rate, the residual amount in the output buffer 18 is decreased until underflow eventually occurs.
Thus the encoder feeds back the residual amount of the output buffer 18 for controlling the quantization step of the quantizer by the quantization step controller 19. Specifically, the encoder manages control for reducing the quantization step for avoiding excessive compression if the residual amount in the output buffer 18 is decreased. The encoder also manages control for increasing the quantization step for raising the compression ratio if the residual amount in the output buffer 18 is increased.
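The feedback described above may be sketched as follows. The linear mapping from buffer occupancy to quantization step, the step range of 1 to 31, and the toy model in which the generated bits are inversely proportional to the step are all assumptions made for illustration; the names are hypothetical.

```python
def q_scale_from_buffer(occupancy, capacity, q_min=1, q_max=31):
    """Map buffer occupancy to a quantization step (assumed linear mapping)."""
    fullness = occupancy / capacity
    q = q_min + fullness * (q_max - q_min)
    return max(q_min, min(q_max, int(round(q))))

def simulate(picture_complexities, capacity, drain_per_picture):
    """Track the residual amount in the buffer over a run of pictures."""
    occupancy = capacity // 2  # start half full
    history = []
    for complexity in picture_complexities:
        q = q_scale_from_buffer(occupancy, capacity)
        bits = complexity // q  # toy model: a larger step generates fewer bits
        occupancy += bits - drain_per_picture
        # The toy buffer is clamped here; a real encoder must keep the
        # occupancy away from both limits to avoid overflow and underflow.
        occupancy = max(0, min(capacity, occupancy))
        history.append((q, bits, occupancy))
    return history

for q, bits, occupancy in simulate([4000] * 5, capacity=10000, drain_per_picture=300):
    print(q, bits, occupancy)
```

As the residual amount in the buffer falls the step becomes smaller, and as it rises the step becomes larger, which is the direction of control described above.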
On the other hand, there is a significant difference in the range of the amount of the encoded data generated by the above-given compression methods, that is the intra-frame coding or the inter-frame coding.
In particular, if the intra-frame coding is employed for compression, a large amount of data is produced, so that, if the vacant capacity of the output buffer 18 is small, the quantization step size must be increased. As the case may be, the overflow of the buffer 18 may be incurred even with the maximum quantization step width. Granted that the data can be stored in the buffer 18, the intra-frame encoded picture produced with the larger quantization step affects the quality of the subsequently produced inter-frame coded picture. Thus, a sufficient vacant capacity must be provided in the output buffer 18 prior to proceeding to intra-frame coding.
Thus the compression methods of a pre-set sequence are set in advance and the quantization step controller 19 manages feedback control of the quantization step size for assuring a sufficient vacant capacity of the output buffer 18 prior to proceeding to intra-frame coding. This allows the encoded data to be held to a pre-set rate.
Recently, there has been a demand for more efficient compression of the picture information. That is, it has been envisaged to achieve the target bit rate while avoiding deterioration caused by compression as far as practicable, either by curtailing the information taking into account the psychovisual characteristics of the human visual sense, or by varying the compression ratio depending upon the amount of information in the input picture or the amount of information in the picture pattern within the same picture. A variety of algorithms have been devised for implementing such data compression.
However, notwithstanding these endeavors, it frequently occurs that satisfactory picture quality cannot be achieved, depending upon the target bit rate or the complexity of the input picture. For example, if the technique of varying the compression ratio depending upon the amount of information in the input picture is resorted to, data compression may occur in a manner contrary to the intention of the picture producer, since compression proceeds without regard to whether or not a picture portion is crucial for the picture producer. In other words, it may occur that the picture portion crucial to the picture producer undergoes deterioration in picture quality while a large amount of bits is consumed for a picture portion not crucial to the picture producer. On the other hand, it is impossible for the compression apparatus to take into account the intention of the picture producer automatically on the basis of the input picture.
It is therefore an object of the present invention to provide a method and apparatus for picture compression whereby compression may be achieved in a manner of reflecting the intention of the picture producer without deviating from the target bit rate.
In one aspect, the present invention provides a picture compression apparatus having means for compressing input picture data, basic compression ratio setting means for setting the basic compression ratio in compressing the input picture data by said compression means, means for designating an optional area in the input picture, designated area importance setting means for setting the importance in compressing the input picture data corresponding to the area designated by the designation means, and compression ratio modifying means for modifying the basic compression ratio based upon the importance for the designated area as set by the designated area importance setting means.
In another aspect, the present invention provides a picture compressing method including the steps of setting a basic compression ratio in compressing input picture data, designating an optional area in an input picture, setting the importance in compressing the input picture data corresponding to the area designated by the designating step, modifying the basic compression ratio based upon the importance for the designated area as set by the designated area importance setting step, and compressing the input picture data using the compression ratio obtained by the compression ratio modifying step.
With the method and apparatus for picture compression according to the present invention, an optional area contemplated by the picture producer is designated in compressing input picture data. The picture in the designated area can be compressed by setting importance to be attached to the designated area. If importance to be attached to the designated area is raised, the area contemplated by the picture producer can be raised in picture quality. In addition, the post-compression bit rate may be accommodated within the target bit rate by modifying the basic compression ratio based upon the importance attached to the designated area.
According to the present invention, an optional area in an input picture is designated, and importance attached to the designated area is set, so that the picture in the designated area can be compressed with the desired importance. The post-compression bit rate can be accommodated within the target bit rate by modifying the basic compression ratio based upon the importance of the designated area.