1. Field of the Invention
This invention relates to improving the efficiency of coding pictures as MPEG-2 video signals, and is applicable to digital broadcast services, Internet video distribution, and package media production.
2. Description of the Related Art
FIG. 9 is a process block diagram illustrating a conventional picture coding apparatus disclosed in, for example, JP-A-9-93537 official gazette.
In FIG. 9, reference numeral 81 designates a picture frame reordering process, 82 denotes a differencing process, 83 designates a discrete cosine transform (DCT) process, 84 denotes a quantization process, 85 designates an inverse quantization process, 86 denotes an inverse DCT (IDCT), 87 designates an addition process, 88 denotes a motion compensation process, 89 designates a variable length coding process, 90 denotes a buffering process, and 91 designates a coding control process.
Next, processing to be performed at the transmitting side of a conventional picture coding apparatus is described hereinbelow with reference to FIG. 9.
Picture frames, each serving as the unit of coding in a picture signal 101, are reordered into coded order by the picture frame reordering process 81, and the reordered picture frames are then outputted. In the motion compensation process 88, the motion compensation prediction of a coding object picture frame 103 is performed by using one or more coded picture frames. Thus, a motion vector 104 and a motion compensation predicted picture frame 105 are generated. In the differencing process 82, a prediction error picture frame 106 is generated by calculating the difference between the coding object picture frame 103 and the motion compensation predicted picture frame 105. In the DCT process 83, a DCT is performed on the prediction error picture frame 106, so that a set of transform coefficients is generated. In the quantization process 84, quantization is performed on the set of transform coefficients, so that a set of quantization indexes is generated. In the inverse quantization process 85, a set of transform coefficients is decoded from the set of quantization indexes. In the IDCT process 86, a prediction error picture frame 107 is decoded from the set of transform coefficients. In the addition process 87, a coded picture frame 108 is generated by adding the prediction error picture frame 107 to the motion compensation predicted picture frame 105. In the variable length coding process 89, the quantization indexes and the motion vector 104 are variable length coded, so that a coded bit string is generated. In the buffering process 90, the coded bit string is temporarily stored, and a coded bit string 102 is then outputted at a fixed bit rate. In the coding control process 91, the feedback control of the DCT process, the quantization process, and the variable length coding process is performed by monitoring the buffering state.
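By way of illustration only, the local coding loop formed by the processes 82 to 87 may be sketched for a single 8x8 block as follows. The function names, the orthonormal DCT scaling, and the uniform quantizer with a single step size `qstep` are assumptions made for this sketch; an actual MPEG-2 coder uses weighting matrices and other details that are omitted here.

```python
import math

N = 8  # block size; MPEG-2 applies the DCT to 8x8 blocks


def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * sum(
            v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n in range(N)))
    return out


def idct_1d(c):
    """Inverse of dct_1d."""
    out = []
    for n in range(N):
        s = math.sqrt(1.0 / N) * c[0]
        s += sum(
            math.sqrt(2.0 / N) * c[k]
            * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k in range(1, N))
        out.append(s)
    return out


def transform_2d(block, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(r) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def code_block(current, predicted, qstep):
    """Return (quantization indexes, locally decoded block)."""
    # Differencing process 82: prediction error.
    error = [[c - p for c, p in zip(cr, pr)]
             for cr, pr in zip(current, predicted)]
    # DCT process 83.
    coeffs = transform_2d(error, dct_1d)
    # Quantization process 84 (uniform quantizer, an assumption).
    indexes = [[round(c / qstep) for c in row] for row in coeffs]
    # Inverse quantization process 85.
    dequantized = [[i * qstep for i in row] for row in indexes]
    # IDCT process 86: decoded prediction error.
    decoded_error = transform_2d(dequantized, idct_1d)
    # Addition process 87: locally decoded (coded) picture block.
    decoded = [[p + e for p, e in zip(pr, er)]
               for pr, er in zip(predicted, decoded_error)]
    return indexes, decoded
```

The locally decoded block, rather than the original block, is what the motion compensation process would use as a reference, so that the coder and a decoder predict from the same data.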
Incidentally, in the case of the MPEG-2 video coding method, a series of pixel blocks (or macroblocks) in a picture frame is referred to as a “slice”. The control of the quantization is usually performed in slice units.
Next, interframe prediction comprising the motion compensation process 88 and the differencing process 82 is described hereunder. Picture frames in the picture coding according to the MPEG-2 standard are classified into three types, namely, I-pictures, P-pictures, and B-pictures, by the manner of performing the interframe prediction. The I-pictures are picture frames, each of which is coded by using only the information within that frame, without performing interframe prediction. The P-pictures are picture frames, each of which is interframe prediction coded by using a coded picture of a past picture frame. The B-pictures are picture frames, each of which is interframe prediction coded by using both past and future picture frames. Therefore, in the case of coding I-pictures, the motion compensation process 88 and the differencing process 82 are omitted. Consequently, a coding object picture frame directly undergoes the DCT process 83.
Next, the picture frame reordering process 81 is described hereinbelow. FIG. 10 is a diagram illustrating the reordering of picture frames. This figure compares the picture frame sequence in display order with the picture frame sequence in coded order. Moreover, this figure illustrates a coding mode corresponding to each of the picture frames (that is, corresponding to each of the three picture types, namely, I-pictures, P-pictures, and B-pictures). A sequence of picture frames arranged in display order is reordered by the picture frame reordering process 81 into a sequence of picture frames arranged in coded order. In the case of the picture coding according to the MPEG-2 standard, a group-of-pictures (GOP) header can be inserted just before a coded bit string corresponding to an I-picture. In the coded bit string, one GOP consists of pictures included within a range from an I-picture placed just after the GOP header to a picture placed immediately before the next GOP header. That is, every GOP includes at least one I-picture. In the case of an example shown in FIG. 10, one GOP consists of 15 picture frames whose picture frame numbers range from (−1) to 13. Let M (frames) and N (frames) denote the frame interval between a P-picture and an I-picture or another P-picture, and the number of picture frames composing one GOP, respectively. In the case of FIG. 10, M=3, and N=15. Usually, the values of such M and N are fixed. In the aforementioned manner, the coding is performed by reordering the sequence of picture frames into coded order and by then carrying out the interframe prediction.
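By way of illustration only, the reordering for fixed M and N may be sketched as follows. The function name `gop_orders` is hypothetical, and the sketch assumes that the GOP, in display order, begins with M−1 B-pictures followed by the I-picture, which is consistent with the frame numbers (−1) to 13 given above but is not otherwise confirmed by the text.

```python
def gop_orders(first_frame, n, m):
    """Return (display_order, coded_order) for one GOP.

    Each entry is a (frame_number, picture_type) pair.
    Assumption: every m-th frame is an anchor; the first anchor is
    the I-picture and the later anchors are P-pictures.
    """
    display = []
    for i in range(n):
        if i % m != m - 1:
            ptype = "B"
        elif i == m - 1:
            ptype = "I"
        else:
            ptype = "P"
        display.append((first_frame + i, ptype))

    coded = []
    pending_b = []
    for frame in display:
        if frame[1] == "B":
            # A B-picture needs the following anchor, so it is held back.
            pending_b.append(frame)
        else:
            # The anchor is coded first, then the B-pictures that
            # precede it in display order.
            coded.append(frame)
            coded.extend(pending_b)
            pending_b = []
    # Trailing B-pictures (when n is not a multiple of m) would wait
    # for an anchor of the next GOP; this sketch simply appends them.
    coded.extend(pending_b)
    return display, coded


display, coded = gop_orders(first_frame=-1, n=15, m=3)
```

With M=3 and N=15, the coded order produced by this sketch begins with the I-picture (frame 1), followed by the B-pictures of frames −1 and 0, then the P-picture of frame 4, and so on.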
Further, FIG. 11 is a process block diagram illustrating a conventional picture coding apparatus disclosed in, for example, JP-A-10-313463 official gazette.
In FIG. 11, reference numeral 200 designates a motion vector detecting portion, 201 denotes a differential picture generating portion, 202 designates a unit division portion, 203 denotes an activity calculating portion, 204 denotes an average unit activity updating portion, 205 designates a target code amount determining portion, 206 denotes a coding portion, 207 designates an allotted code amount updating portion, and 208 denotes a local decoder.
Next, processing to be performed at the transmitting side of this picture coding apparatus is described hereinbelow with reference to FIG. 11.
As shown in FIG. 11, an input picture signal is inputted to both the motion vector detecting portion 200 and the differential picture generating portion 201. The motion vector detecting portion 200 outputs a motion vector according to the picture type of the input picture. That is, in the case that the input picture is a P-picture or B-picture, this portion performs motion vector detection and then outputs a motion vector. In the case that the input picture is an I-picture, this portion does not perform motion vector detection.
In the case that the input picture is a P-picture or B-picture, the differential picture generating portion 201 generates a prediction picture from both the inputted motion vector and a decoded reference picture, which is inputted from the local decoder 208. Subsequently, the portion 201 performs a differencing operation on the prediction picture and the input picture. Then, the portion 201 outputs a differential picture. This differential picture is inputted to the unit division portion 202, the activity calculating portion 203, and the coding portion 206. Incidentally, in the case that the input picture is an I-picture, the input picture itself is outputted from the differential picture generating portion 201, and inputted to the unit division portion 202, the activity calculating portion 203, and the coding portion 206.
The unit division portion 202 defines an I-unit, which consists of one I-picture and two B-pictures, and a P-unit, which consists of one P-picture and two B-pictures. Further, this portion 202 determines, according to the picture type of the inputted differential picture, whether the inputted differential picture belongs to an I-unit or a P-unit. Further, the portion 202 divides the inputted differential pictures into the units. Then, the portion 202 outputs unit information on each of the units.
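By way of illustration only, the unit division may be sketched as follows. The function name `divide_into_units` is hypothetical, and the sketch assumes that the pictures arrive in coded order and that each B-picture belongs to the unit of the I-picture or P-picture coded immediately before it; the text above does not state which B-pictures are grouped with which anchor.

```python
def divide_into_units(picture_types):
    """Group pictures, given by type in coded order, into units.

    Returns a list of dicts with the unit kind ("I-unit" or "P-unit")
    and the indexes of the pictures belonging to that unit.
    """
    units = []
    for index, ptype in enumerate(picture_types):
        if ptype in ("I", "P"):
            # An I-picture or P-picture opens a new unit.
            units.append({"kind": ptype + "-unit", "pictures": [index]})
        elif ptype == "B" and units:
            # Assumed rule: a B-picture joins the most recent unit.
            units[-1]["pictures"].append(index)
        else:
            raise ValueError(
                f"cannot assign picture {index} of type {ptype!r} to a unit")
    return units


units = divide_into_units(["I", "B", "B", "P", "B", "B", "P", "B", "B"])
```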
The activity calculating portion 203 performs an activity operation on the inputted differential picture, and then outputs a frame activity. The activity is a measure of the complexity of a picture, and thus of the ease with which the picture can be coded.
This activity is inputted to the target code amount determining portion 205. Moreover, the activity calculating portion 203 outputs a unit activity of the unit, to which the differential picture belongs, from the unit information thereon. This unit activity is inputted to the average unit activity updating portion 204 and the target code amount determining portion 205.
The average unit activity updating portion 204 updates the average unit activity of the unit from the inputted unit activity.
The target code amount determining portion 205 outputs a target code amount corresponding to a coded frame according to the inputted frame activity, the unit activity, the average unit activity, and the allotted code amount.
The coding portion 206 performs coding on the inputted differential picture, based on the inputted target code amount. Then, the portion 206 outputs coded data. Subsequently, the coded data is inputted to the allotted code amount updating portion 207 and the local decoder 208.
The allotted code amount updating portion 207 calculates a generated code amount from the inputted coded data, and updates the allotted code amount.
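By way of illustration only, the interplay of the portions 204, 205, and 207 may be sketched as follows. The allocation rule shown here (a budget proportional to the activity) is a hypothetical example and is not the formula of the cited gazette, which is not reproduced in this description; all function and parameter names are likewise assumptions.

```python
def update_average_unit_activity(average, units_seen, unit_activity):
    """Running mean of the unit activity (portion 204, assumed rule)."""
    return (average * units_seen + unit_activity) / (units_seen + 1)


def target_code_amount(frame_activity, unit_activity,
                       average_unit_activity, remaining_bits, units_left):
    """Target bits for one frame (portion 205, assumed rule).

    The remaining budget is split evenly over the units still to be
    coded, scaled by how complex this unit is relative to the average,
    and then shared within the unit in proportion to frame activity.
    """
    if unit_activity <= 0 or average_unit_activity <= 0 or units_left <= 0:
        raise ValueError("this sketch requires positive inputs")
    nominal_unit_bits = remaining_bits / units_left
    unit_bits = nominal_unit_bits * unit_activity / average_unit_activity
    return unit_bits * frame_activity / unit_activity


def update_allotted_code_amount(remaining_bits, generated_bits):
    """Subtract the bits actually generated (portion 207, assumed rule)."""
    return remaining_bits - generated_bits


target = target_code_amount(10.0, 30.0, 30.0,
                            remaining_bits=900000, units_left=3)
remaining = update_allotted_code_amount(900000, generated_bits=120000)
```

Under this rule, a frame in a unit of above-average activity receives a larger target, while the total is kept in check because the bits actually generated are deducted from the remaining budget.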
The local decoder 208 performs decoding on the inputted coded data, and generates a decoded picture.
In this way, the conventional apparatus determines the degree of complexity of a to-be-coded picture frame in terms of the activity according to a result of precoding. Then, the conventional apparatus sets a target code amount corresponding to the unit of coding control so that the total amount of generated codes is within a desired code amount. Thus, the conventional apparatus controls quantization characteristics in coding.
The conventional picture coding apparatus shown in FIG. 9 performs the aforementioned operation, and also performs the feedback control of coding characteristics by monitoring the amount of codes generated for each picture frame and each slice, and the buffering state, so that the coding bit rate is constant. However, in the case of instantaneously varying scenes and rapidly moving pictures, the time correlation (namely, the correlation between picture frames) and the spatial correlation (namely, the correlation within a picture frame, for example, between slices) are low. Consequently, the amount of actually generated codes usually far exceeds the amount of codes estimated before the coding. That is, in the case of coding instantaneously varying scenes and rapidly moving pictures, the feedback control often cannot follow the actual variation in the generated code amount and thus fails, with the result that the picture quality of coded pictures is deteriorated.
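By way of illustration only, the lag of such feedback control may be sketched as follows. The mapping from buffer fullness to a quantizer scale in the range 1 to 31 and the toy model in which the generated bits equal a complexity value divided by the quantizer scale are assumptions of this sketch, not the control law of the apparatus of FIG. 9.

```python
def qscale_from_buffer(fullness, capacity, q_min=1, q_max=31):
    """Coarser quantization as the buffer fills (assumed linear rule)."""
    q = round(q_max * fullness / capacity)
    return max(q_min, min(q_max, q))


def simulate(complexities, capacity, drain_per_unit, start_fullness):
    """Run the feedback loop over a sequence of coding units.

    Returns a list of (qscale, generated_bits, fullness_after) tuples.
    """
    fullness = start_fullness
    history = []
    for complexity in complexities:
        # The quantizer scale is chosen from the buffer state alone,
        # before the complexity of the current unit is known.
        q = qscale_from_buffer(fullness, capacity)
        bits = complexity / q  # toy model of the generated code amount
        fullness = max(0.0, fullness + bits - drain_per_unit)
        history.append((q, bits, fullness))
    return history


# Three quiet units followed by an abrupt tenfold rise in complexity.
history = simulate([1000, 1000, 1000, 10000],
                   capacity=10000, drain_per_unit=100, start_fullness=5000)
```

In this run, the quantizer scale applied to the fourth unit is the one chosen from the buffer state left by the quiet units, so the generated code amount jumps before the control can react; the correction only takes effect from the following unit onward.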
Further, in the case of the conventional picture coding apparatus illustrated in FIG. 11, the degree of complexity of picture frames to be coded is determined according to a result of precoding. Further, the target code amount corresponding to the unit of coding control is set so that the total amount of generated codes is within a desired code amount. Thus, the quantization characteristics in coding are controlled. Such a coding method is suitable for coding signals to be stored on storage media, such as a DVD. However, such a conventional control method mainly controls the quantization characteristics, so that a change in coded picture quality is often visible at the points where the quantization characteristics change.
Furthermore, the conventional picture coding apparatus does not control the coding mode (or picture type) and the size of GOP according to the characteristics of a picture to be coded, which are estimated from different aspects, for instance, the long-term variation in complexity of the picture, and the presence or absence of a scene change. It is, thus, difficult to achieve highly efficient coding by suppressing the variation in the coding quality within a restricted range of amounts of codes.
Additionally, a feedforward control method is sometimes employed for controlling the quantization characteristics. In this method, the quantization is performed by using all of the available quantization characteristics, and the quantization characteristic yielding the amount of codes closest to a target code amount is selected. However, this method has the drawbacks of requiring a very large amount of computation and a very large circuit size.
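By way of illustration only, such feedforward selection may be sketched as follows. The function names and the toy code-length model in `estimate_bits` are assumptions of this sketch; an actual apparatus would count the bits produced by its variable length coder for each candidate.

```python
def estimate_bits(coeffs, qscale):
    """Toy code-length model (an assumption of this sketch).

    A zero index costs 1 bit; a nonzero index costs its binary length
    plus 1 sign bit.
    """
    total = 0
    for c in coeffs:
        index = int(abs(c) / qscale)  # truncating quantizer
        total += 1 if index == 0 else index.bit_length() + 1
    return total


def pick_qscale(coeffs, target_bits, candidates=range(1, 32)):
    """Quantize with every candidate and keep the closest to the target.

    On a tie, the smallest (finest) candidate is kept, because min()
    returns the first minimum it encounters.
    """
    return min(candidates,
               key=lambda q: abs(estimate_bits(coeffs, q) - target_bits))


coeffs = [100, 50, 25, 12, 6, 3, 1, 0]
best = pick_qscale(coeffs, target_bits=20)
```

Every call to `pick_qscale` quantizes the same coefficients once per candidate, here 31 times, which reflects the large amount of computation noted above.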