This invention relates to a coding apparatus and method, a program and a recording medium, and more particularly to a coding apparatus and method, a program and a recording medium suitable for use where rate control is performed in order to prevent a breakdown of a VBV buffer.
Various compression coding methods have been proposed for compressing video data and audio data to decrease the information amount. A representative one of the compression coding methods is MPEG2 (Moving Picture Experts Group Phase 2). When quantization control of the feedback type is performed in MPEG2, usually the Q scale used for coding of the j-th frame is used to determine an optimum step size for coding of the (j+1)-th frame.
In the conventional quantization control method, however, if an image 2 having a high degree of global complexity in coding follows another image 1 having a low degree of global complexity in coding as seen in FIG. 1, then a small Q scale of the image 1 which has a low degree of global complexity in coding and is easy to encode is used to start encoding of the succeeding image 2 having a high degree of global complexity in coding. This gives rise to a phenomenon that, upon encoding, a given bit amount is used up at an intermediate portion of the image 2 and encoding of the image 2 is performed correctly only up to the intermediate portion of the image 2.
For example, in the MPEG 2, a method called low delay coding wherein the delay time is reduced to less than 150 [ms] is prepared. In the low delay coding, neither B pictures which cause a reordering delay nor I pictures from which a large amount of codes are generated are used, but only P pictures are used. Further, a P picture is delineated into an intraslice which includes several slices and an interslice which includes all of the remaining slices so that it can be coded without re-ordering.
For example, where the image 1 and the image 2 of FIG. 1 are low delay coded, if the image 2 having a high degree of global complexity in image coding is coded next to the image 1 having a low degree of global complexity in image coding, then encoding of the image 2 having a high degree of global complexity in coding is started with a comparatively small Q scale. Therefore, a phenomenon appears that the preceding picture remains at some slice or slices at a lower end of the screen of the image 2. This phenomenon has an influence until an intraslice appears at a place where a similar problem appears subsequently.
In order to solve the problem just described, a coding apparatus and a coding method have been proposed wherein coded data with which an image of a high picture quality can be reproduced on the decoder side can be generated in a low delay mode as disclosed, for example, in Japanese Patent Laid-Open No. Hei 11-205803 (hereinafter referred to as Patent Document 1).
In particular, in order to perform quantization control of an ordinary feedback type to determine an optimum quantization step size for each of an intraslice and an interslice to perform quantization control, a scene change wherein a succeeding picture has a pattern much different from that of a preceding picture is detected. If a scene change is detected, then not a quantization index data Q(j+1) calculated based on the preceding picture is used, but an initial buffer capacity d(0) of a virtual buffer is updated based on ME residual information of the picture to be coded next so that the quantization index data Q(j+1) is re-calculated newly. Consequently, even if a scene change occurs, an optimum quantization step size is determined for each of an intraslice and an interslice and used for quantization control.
The ME residual is calculated in a unit of a picture and is a total value of difference values of the luminance between a preceding picture and a succeeding picture. Accordingly, when the ME residual information exhibits a high value, this represents that the pattern of the preceding picture and the pattern of the picture to be coded next are much different from each other, that is, a scene change.
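The picture-unit total described above can be sketched as follows. This is only an illustration under stated assumptions: the function name me_residual is hypothetical, each picture is taken to be a list of rows of luminance samples, the "difference values" are taken as absolute differences of co-located samples, and no motion compensation is applied. The text does not fix any of these details.

```python
def me_residual(prev_luma, next_luma):
    """Sum of absolute luminance differences between two pictures.

    Assumptions (not from the source): pictures are lists of rows of
    integer luminance samples, differences are absolute values of
    co-located samples, and no motion compensation is applied.
    """
    if len(prev_luma) != len(next_luma):
        raise ValueError("pictures must have the same number of rows")
    total = 0
    for row_prev, row_next in zip(prev_luma, next_luma):
        if len(row_prev) != len(row_next):
            raise ValueError("rows must have the same length")
        total += sum(abs(a - b) for a, b in zip(row_prev, row_next))
    return total


if __name__ == "__main__":
    flat = [[10, 10], [10, 10]]
    changed = [[10, 12], [7, 10]]
    print(me_residual(flat, changed))
```

A large return value corresponds to the case in which the two pictures have much different patterns.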
The coding method is described below with reference to FIG. 2.
At step S1, ME residual information obtained, for example, when a motion vector is detected is acquired. The ME residual information acquired is represented by ME_info.
At step S2, an average value avg of ME residual information is subtracted from the acquired ME residual information, and it is discriminated whether or not the resulting difference value is higher than a predetermined threshold value D. The average value avg of the ME residual information is a value updated at step S4 hereinafter described and is given by the following expression (1):
$$\mathrm{avg} = \tfrac{1}{2}\,(\mathrm{avg} + \mathrm{ME\_info}) \qquad (1)$$
If it is discriminated at step S2 that the calculated difference value is equal to or lower than the predetermined threshold value D, then it is discriminated that the difference between the pattern of the current picture and the pattern of the immediately preceding picture is not significant, that is, no scene change has occurred, and the processing advances to step S4.
On the other hand, if it is discriminated at step S2 that the calculated difference value is higher than the predetermined threshold value D, then it is discriminated that the difference between the pattern of the current picture and the pattern of the preceding picture is significant, that is, a scene change has occurred. Therefore, at step S3, an initial buffer capacity d(0) of a virtual buffer is calculated based on expressions (2), (3), (4) and (5) given below to update the virtual buffer.
X, which represents the global complexity (GC) of a picture unit, is given by the following expression (2):
$$X = T \times Q \qquad (2)$$
where T is the generated code amount of the picture unit, and Q is the average value of the quantization step sizes of the picture unit.
Then, if it is assumed that the global complexity X of the image of the picture unit is equal to the ME residual information ME_info, that is, when the following expression (3) is satisfied, the quantized index data Q of the entire picture is given by the expression (4):
$$X = \mathrm{ME\_info} \qquad (3)$$
$$Q = \frac{d(0) \times 31}{2 \times (br/pr)} \qquad (4)$$
where br is the bit rate, and pr is the picture rate.
Further, the initial buffer capacity d(0) of the virtual buffer in the expression (4) is given by the following expression (5):
$$d(0) = \frac{2 \times \mathrm{ME\_info} \times (br/pr)}{31 \times T} \qquad (5)$$
The initial buffer capacity d(0) of the virtual buffer is substituted back into the expression (4) to calculate the quantized index data Q of the entire picture.
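The text does not spell out the intermediate steps that lead from the expressions (2) to (4) to the expression (5). They can be reconstructed as follows, using only the symbols already defined (X, T, Q, d(0), br, pr and ME_info):

```latex
\begin{align*}
\mathrm{ME\_info} &= X = T \times Q
  && \text{by (3) and (2)} \\
&= T \times \frac{d(0) \times 31}{2 \times (br/pr)}
  && \text{by (4)} \\
\Rightarrow\quad d(0) &= \frac{2 \times \mathrm{ME\_info} \times (br/pr)}{31 \times T}
  && \text{which is (5)}
\end{align*}
```

Substituting this d(0) back into the expression (4) gives Q = ME_info / T, so that, under the assumption of the expression (3), a larger ME residual yields a proportionally larger quantized index for the same code amount T.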
When it is discriminated at step S2 that the calculated difference value is equal to or lower than the predetermined threshold value D or after the process at step S3 comes to an end, the average value avg of the ME residual information is calculated and updated in accordance with the expression (1) given hereinabove at step S4 in preparation for a picture to be supplied next. Thereafter, the processing returns to step S1 so that the processes described hereinabove are repeated.
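The flow of steps S1 to S4 can be sketched as follows. This is a minimal illustration, not the implementation of Patent Document 1: the class and function names are hypothetical, the starting value of avg is an assumption (the text does not state one), and T is passed in by the caller as code_amount.

```python
from dataclasses import dataclass


@dataclass
class SceneChangeDetector:
    threshold: float      # D in step S2
    bit_rate: float       # br
    picture_rate: float   # pr
    avg: float = 0.0      # running average of ME_info (initial value assumed)

    def process(self, me_info, code_amount):
        """Run steps S2 to S4 for one picture.

        Returns (scene_change, d0); d0 is None when no scene change
        is detected and the virtual buffer is left untouched.
        """
        # Step S2: compare (ME_info - avg) with the threshold D.
        scene_change = (me_info - self.avg) > self.threshold
        d0 = None
        if scene_change:
            # Step S3: expression (5).
            d0 = (2.0 * me_info * (self.bit_rate / self.picture_rate)
                  / (31.0 * code_amount))
        # Step S4: expression (1), executed on both branches.
        self.avg = 0.5 * (self.avg + me_info)
        return scene_change, d0


def quant_index(d0, bit_rate, picture_rate):
    """Expression (4): quantized index data Q for the whole picture."""
    return d0 * 31.0 / (2.0 * (bit_rate / picture_rate))


if __name__ == "__main__":
    det = SceneChangeDetector(threshold=100.0, bit_rate=3000.0,
                              picture_rate=30.0, avg=50.0)
    print(det.process(60.0, 100.0))    # small residual: no scene change
    changed, d0 = det.process(1000.0, 100.0)
    print(changed, quant_index(d0, 3000.0, 30.0))
```

The average is updated only after the comparison, so the picture under test does not influence its own threshold decision, which matches the order of steps S2 and S4.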
If a scene change wherein a succeeding picture has a pattern much different from that of a preceding picture is detected through the process described above with reference to the flow chart of FIG. 2, then the initial buffer capacity d(0) of the virtual buffer is updated with the ME residual information ME_info of the picture to be coded next. Then, the quantized index data Q(j+1) is calculated newly based on the updated value of the initial buffer capacity d(0). Consequently, an optimum quantization step size is determined for each of an intraslice and an interslice in response to a scene change.
A variation of the virtual buffer capacity between a macro block at the first coding position and another macro block at the last coding position of different pictures where the process described above with reference to FIG. 2 is executed is described with reference to FIG. 3. It is assumed that, among pictures 21 to 25 of FIG. 3, a left side one is a picture coded prior in time. Also it is assumed that each of the pictures 21 to 25 is formed from n+1 macro blocks.
For example, if the pictures 21 and 22 have patterns much different from each other, or in other words, if a scene change occurs between the pictures 21 and 22, then the process described hereinabove with reference to FIG. 2 is executed upon coding of the picture 22. Accordingly, the virtual buffer capacity d1_0 at the top coding position of the picture 22 is set to a value higher than that of the virtual buffer capacity d0_n at the last coding position of the picture 21. Consequently, upon coding of the picture 22, the situation that a given bit amount is used up at an intermediate portion of the screen can be prevented.
Then, if no scene change is detected with regard to the pictures 23 to 25, then the virtual buffer capacity d2_0 at the top coding position of the picture 23 has a value proximate to the virtual buffer capacity d1_n at the last coding position of the picture 22; the virtual buffer capacity d3_0 at the top coding position of the picture 24 has a value proximate to the virtual buffer capacity d2_n at the last coding position of the picture 23; and the virtual buffer capacity d4_0 at the top coding position of the picture 25 has a value proximate to the virtual buffer capacity d3_n at the last coding position of the picture 24.
In this manner, in rate control which is used popularly, the virtual buffer capacity, that is, the quantization value, is determined through feedback in a unit of a macro block. Therefore, when the quantization value is changed to a high value at a scene change, succeeding pictures are coded using an unnecessarily high quantization value, even though their patterns do not change as significantly as at the scene change, until the quantization value settles to a value appropriate to the pattern through feedback. This significantly deteriorates the picture quality of several pictures after a scene change.
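The slow settling can be illustrated with a toy simulation of macro block feedback. The buffer update and the mapping from buffer capacity to quantization value follow the TM5-style form d_j = d_0 + B_(j-1) - T×(j-1)/MB_cnt and Q_j = d_j×31/r with r = 2×br/pr. Everything else is an assumption made for illustration only: the function name, the default parameter values, the real-valued Q, the per-picture target T = br/pr, and above all the bit model in which each macro block generates complexity/Q bits.

```python
def simulate_start_q(d0, complexity, n_pictures, *, bit_rate=3_000_000.0,
                     picture_rate=30.0, mb_count=100):
    """Return the quantization value at the top of each picture.

    Toy model: every macro block generates complexity / Q bits, and the
    buffer capacity after the last macro block of one picture is used
    as the initial capacity of the next picture.
    """
    r = 2.0 * bit_rate / picture_rate      # reaction parameter
    target = bit_rate / picture_rate       # assumed per-picture target T
    start_q = []
    d = d0
    for _ in range(n_pictures):
        d_start = d
        bits_so_far = 0.0
        for j in range(mb_count):
            # Buffer capacity before coding macro block j (0-based).
            d = d_start + bits_so_far - target * j / mb_count
            q = min(31.0, max(1.0, d * 31.0 / r))
            if j == 0:
                start_q.append(q)
            bits_so_far += complexity / q  # toy bit model (assumption)
        # Capacity after the last macro block carries over.
        d = d_start + bits_so_far - target
    return start_q


if __name__ == "__main__":
    r = 2.0 * 3_000_000.0 / 30.0
    # Start from a capacity that corresponds to Q = 25, as if raised
    # at a scene change, then code pictures of moderate complexity.
    for q in simulate_start_q(25.0 * r / 31.0, 5000.0, 5):
        print(round(q, 2))
```

With these assumed numbers the value appropriate to the pattern is Q = complexity × mb_count / T = 5, yet the quantization value at the top of each picture approaches it only gradually, so the first pictures after the raised value are coded more coarsely than their pattern requires.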
Further, not only in the low delay coding, but also in a coding process by some other method, in order to prevent a breakdown of the VBV buffer caused by an increase of the generated code amount, for example, by a scene change, such control as to increase the quantization value is performed. Also in this instance, an unnecessarily high quantization value is used for coding for a period of time until after the quantization value is settled to a value appropriate for the pattern through feedback, and this gives rise to deterioration of the picture quality.
Further, in the MPEG-2 TM5 (Test Model 5), since rate control is performed in accordance with the picture type, the value of a virtual buffer for a preceding picture of the same picture type is used as an initial value for a virtual buffer for a picture of an object of coding. Accordingly, images to which the TM5 is applied suffer from deterioration of the picture quality similarly with regard to a next picture of the same picture type to a picture having a quantization value increased as a countermeasure for a scene change.