1. Field of the Invention
The present invention relates to a moving picture coding method, and more particularly to a moving picture coding method appropriate to a moving picture coding, recording, and reproducing apparatus.
2. Description of the Prior Art
Numerous technologies have been developed in recent years for processing digitized moving pictures, including various methods of compressing and thereby reducing the amount of image data. Digital image compression is needed for efficient recording and transmission of digital moving pictures because of the inherently large amount of information resulting from the digitizing process. The extremely large volume of information recorded on digital video disks storing movies and similar subject matter, for example, necessitates efficient moving picture coding and decoding technologies.
One known moving picture coding method uses time base correlations to compress the image data, and may use specific correlations within or between image frames. Compared with intra-frame coding, coding using inter-frame correlations can achieve a higher compression rate. When errors occur, however, they propagate along the time base, so a refresh operation using intra-frame coding must be executed at a predetermined cycle.
FIG. 1 shows the relationship between predictive frames and processed frames in a method compressing moving pictures using time base correlations. There are three different kinds of frames: intra frames I (I-picture frame or frame I), bidirectional interpolated frames B (B-picture frame or frame B), and predictive frames P (P-picture frame or frame P). These are produced repeatedly in the pattern I, B, B, P, B, B, as shown in FIG. 1. Frame I contains the complete data for one frame and can reproduce one frame picture by itself. Frame B contains data for one frame but cannot reproduce a frame picture by itself; it references the data from frame I and frame P. Similarly, frame P contains data for one frame but cannot reproduce a frame picture by itself; it references the data from the preceding frame I or frame P. Frames I are compressed using only the data correlations within that frame, and are used as the predictive frames when processing frames P and B. Frames P are coded by referencing the preceding frame I or the preceding frame P as the predictive frame. Frames B are coded by referencing, as the predictive frames, the frames I and P before and after frame B, and the interpolated image generated from said frames I and P. The number following each coding type (I, B, P) indicates the display sequence of the frame in chronological order. For example, frame I-0 is a type I frame in display sequence 0.
FIG. 2 shows the frames shown in FIG. 1 rearranged in the order processed. Frame coding starts with frame I-0, followed by frame P-3 using the already-coded frame I-0 as the predictive frame. Frames B-2 and B-1 are then frame interpolation coded using as the predictive frames the already-coded frames I-0 and P-3, and an interpolated image generated from frames I-0 and P-3. Frame I-6 is then coded by intra-frame coding, and frames B-4 and B-5 are frame interpolation coded using as the predictive frames the already-coded frames P-3 and I-6, and the interpolated image generated from frames P-3 and I-6. The coding sequence of each frame is thus determined based on the type of coding applied. Frame groups of plural frames are also defined as the processing unit for the coding operations, and are the data unit accessed for editing and special reproduction modes. In FIGS. 1 and 2, the first group of pictures GOP1 comprises frames 0-3, and the second GOP2 comprises frames 4-9; each of the subsequent GOPs similarly comprises six frames.
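The rearrangement from display order (FIG. 1) to processing order (FIG. 2) can be sketched as follows. This is only a minimal illustration, not the logic of any particular apparatus: the function name `coding_order` and the string labels are hypothetical, and flushing any trailing B frames at the end of the input is an assumption made so that the sketch handles a truncated sequence.

```python
def coding_order(display_frames):
    """Reorder frame labels from display order to coding order.

    B frames are held back until the next I or P frame (their later
    reference) has been emitted, then released in display order.
    """
    ordered = []
    pending_b = []
    for frame in display_frames:
        if frame.startswith("B"):
            # A B frame needs the following I or P frame, so defer it.
            pending_b.append(frame)
        else:
            # I or P frame: code it first, then the deferred B frames.
            ordered.append(frame)
            ordered.extend(pending_b)
            pending_b = []
    # Assumption: B frames with no later reference are emitted at the end.
    ordered.extend(pending_b)
    return ordered


display = ["I-0", "B-1", "B-2", "P-3", "B-4", "B-5", "I-6"]
print(coding_order(display))
```

For the display sequence above, the sketch yields I-0, P-3, B-1, B-2, I-6, B-4, B-5, so each I or P frame precedes the B frames that reference it.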
FIG. 3A shows the target coding level assigned to each GOP when coding each GOP in the sequence shown in FIG. 2. In a system manipulating moving pictures, it is necessary to keep the coding rate of the coded data below the maximum data transfer rate of the recording medium. In addition, if the recording time for a given recordable capacity is uniformly determined when recording images to a recording medium, it is also necessary to control the average transfer rate of the coded data. One means of controlling the coding rate is to maintain a constant average coding rate in each GOP. Specifically, as shown in FIG. 3A, a constant data amount is allocated to all GOPs in the input image.
To control the coding rate in the GOP, it is possible, for example, to provide a buffer at the output stage of the coding means, control the quantization pitches according to the amount of untransferred code in the buffer, and thereby control the data amount. Virtual buffers of differing capacities may also be created according to the type of the coding method in each frame, and the data amount controlled by these virtual buffers. This is because the amount of code required to obtain the same image quality increases according to the coding type (i.e., I>P>B), and this can therefore be used to change the size of the virtual buffers according to the type of frame processing, thereby preventing the buffer capacity from being too large or too small.
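The buffer feedback described above can be sketched as follows. This is a hedged illustration only: the linear mapping from buffer fullness to a quantization scale, the scale range of 1 to 31, the capacity values, and all names (`VIRTUAL_CAPACITY`, `update_buffer`, `quantization_scale`) are assumptions chosen for the example and are not taken from the text.

```python
# Hypothetical virtual buffer capacities in bits, ordered I > P > B.
VIRTUAL_CAPACITY = {"I": 400_000, "P": 200_000, "B": 100_000}


def update_buffer(buffer_bits, generated_bits, transferred_bits):
    """Add newly generated code, remove transferred code, never go below 0."""
    return max(0, buffer_bits + generated_bits - transferred_bits)


def quantization_scale(buffer_bits, frame_type, min_scale=1.0, max_scale=31.0):
    """Map the untransferred code amount to a quantization scale.

    Assumption: the scale grows linearly with buffer fullness, so a fuller
    buffer gives coarser quantization and therefore less generated code.
    """
    capacity = VIRTUAL_CAPACITY[frame_type]
    fullness = min(max(buffer_bits / capacity, 0.0), 1.0)
    return min_scale + (max_scale - min_scale) * fullness


# The same backlog is treated as less urgent for an I frame than for a P frame.
print(quantization_scale(200_000, "I"), quantization_scale(200_000, "P"))
```

Because the I-frame buffer is the largest, the same amount of untransferred code produces a smaller scale for an I frame than for a P or B frame.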
FIG. 4 is a block diagram of a conventional moving picture coding apparatus. As shown in FIG. 4, this moving picture coding apparatus comprises an image input means 400 for receiving an image signal such as shown in FIG. 1 (the frames shown in FIG. 1 are not yet compressed); a frame locator 401 for setting the alignment pattern of the I, P, and B frames and setting the number of frames to be included in each GOP; a frame processing sequence controller 402 for changing the sequence of the frames according to the frame location information output from the frame locator 401 and for producing an image signal such as shown in FIG. 2; a frame memory 412 having a capacity for storing at least two frames for temporarily holding the frame data that has to be shifted to a later position in the sequence; a motion vector detector 403 for detecting in macro block units (e.g., 16×16 pixels) the motion vector between the newly received frame for processing (the "process frame") and the already-processed frame(s) used as the predictive frame of the process frame; a subtracter 404 for obtaining the difference between each block of the process frame and the predictive frame; a discrete cosine transform (DCT) unit 405 for applying a discrete cosine transformation (one type of orthogonal transformation) to the difference values; a quantizer 406 for quantizing the conversion coefficients obtained by the DCT unit 405, and also for producing a compressed frame signal; a coding unit 407 for coding (e.g., by Huffman coding) the compressed frame signal; a data amount controller 408 for controlling the coding of each frame in accordance with the frame type (I, P, B) data obtained from the frame locator 401, and for controlling the quantization pitches based on the data amount generated by the coding unit 407 so that the instantaneous coded amount stays below a predetermined maximum level; a dequantizer 409 and a DCT inverter 410 for temporarily expanding the compressed frame signal; and a frame memory 411 for storing the expanded frames.
The operation of the moving picture coding apparatus shown in FIG. 4 when coding with the frame coding sequence shown in FIG. 1 is described in detail below.
First, the frame processing sequence controller 402 passes frame I-0 to the DCT unit 405. Because frame I-0 is intra-frame coded, the motion vector detector 403 and subtracter 404 do not operate. The DCT unit 405 applies DCT processing by block unit (e.g., 8×8 pixels) to convert the signal to 8×8 matrix data expressed in the frequency domain. The quantizer 406 then quantizes the 8×8 matrix data using quantization tables such as shown in FIG. 5 for quantizing each value in the 8×8 matrix block unit. For example, using the table shown in FIG. 5, the first row first column data in the 8×8 matrix block unit is quantized as 0 when the data is between 0 and 7; as 1 when the data is between 8 and 15; as 2 when the data is between 16 and 23, and so on. In other words, the first row first column data is divided by 8 and the fractional part is discarded to obtain the quantized value.
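The quantization step in the example above (a pitch of 8 mapping 0 to 7 onto 0 and 8 to 15 onto 1) can be sketched as follows. The function names are hypothetical, and truncating negative coefficients toward zero is an assumption, since the text gives only non-negative examples.

```python
def quantize(coefficient, pitch):
    """Divide a DCT coefficient by its pitch and discard the fraction.

    Assumption: negative coefficients are truncated toward zero.
    """
    return int(coefficient / pitch)


def quantize_block(block, table):
    """Quantize each entry of a block with the pitch at the same position."""
    return [
        [quantize(value, pitch) for value, pitch in zip(block_row, table_row)]
        for block_row, table_row in zip(block, table)
    ]


# With a pitch of 8: 0..7 -> 0, 8..15 -> 1, 16..23 -> 2.
print([quantize(value, 8) for value in (0, 7, 8, 15, 16, 23, 24)])
```

The printed list is `[0, 0, 1, 1, 2, 2, 3]`, which reproduces the ranges given for the first row, first column entry.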
In general, there is a strong correlation between adjacent image frames, and the energy is therefore concentrated in the low frequency component. Therefore, as shown in FIG. 5, the quantization pitches are made small in the low frequency components and larger in the high frequency components. The quantized coefficients are entropy coded by the coding unit 407, and the coded data is output from the coding unit 407. The data amount controller 408 controls the quantization pitches based on the frame type (I, P or B) and the instantaneous amount of coded data generated by the coding unit 407, and applies this information to the quantizer 406. For example, when the frame type is frame I, the table shown in FIG. 5 is used, but when the frame type is frame P or frame B, the table shown in FIG. 6 is used. The target data amount of each frame in this case is calculated from the targeted average coding rate of the overall image, the number of frames in the GOP, and the coding method of the frames in the GOP. For example, the target data amount of frame I can be obtained from equation (1):
$$\mathrm{Rate}\times\frac{\mathrm{FrameGOP}}{\mathrm{FrameRate}}\times\frac{I_{\mathrm{ratio}}}{I_{\mathrm{ratio}}\,I_{\mathrm{frame}}+P_{\mathrm{ratio}}\,P_{\mathrm{frame}}+B_{\mathrm{ratio}}\,B_{\mathrm{frame}}}\tag{1}$$
where Rate (bits/second) is the coding rate of the overall image; FrameRate (frames/second) is the frame rate; FrameGOP is the number of frames in each GOP; $I_{\mathrm{frame}}$ is the number of I frames in each GOP; $P_{\mathrm{frame}}$ is the number of P frames in each GOP; $B_{\mathrm{frame}}$ is the number of B frames in each GOP; and $I_{\mathrm{ratio}}:P_{\mathrm{ratio}}:B_{\mathrm{ratio}}$ is the ratio of the code amounts of the I, P, and B frames.
The target data amount of frames P and B can be similarly obtained from equations (2) and (3):
$$\mathrm{Rate}\times\frac{\mathrm{FrameGOP}}{\mathrm{FrameRate}}\times\frac{P_{\mathrm{ratio}}}{I_{\mathrm{ratio}}\,I_{\mathrm{frame}}+P_{\mathrm{ratio}}\,P_{\mathrm{frame}}+B_{\mathrm{ratio}}\,B_{\mathrm{frame}}}\tag{2}$$
$$\mathrm{Rate}\times\frac{\mathrm{FrameGOP}}{\mathrm{FrameRate}}\times\frac{B_{\mathrm{ratio}}}{I_{\mathrm{ratio}}\,I_{\mathrm{frame}}+P_{\mathrm{ratio}}\,P_{\mathrm{frame}}+B_{\mathrm{ratio}}\,B_{\mathrm{frame}}}\tag{3}$$
The data amount ratio ($I_{\mathrm{ratio}}:P_{\mathrm{ratio}}:B_{\mathrm{ratio}}$) of the I, P, and B frames may be held constant for the entire input image, or the ratio may be carried over by calculating it from the data amount of each frame generated during actual coding of the previous GOP (e.g., the preceding group on the time base).
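Equations (1) to (3) can be evaluated together, as in the sketch below. The function name and the numeric values are hypothetical: a rate of 4,000,000 bits/second, 30 frames/second, and a 4:2:1 ratio are chosen only for illustration, while the frame counts (1 I, 1 P, 4 B) follow the six-frame I, B, B, P, B, B pattern. The sketch also assumes that FrameGOP equals the sum of the I, P, and B frame counts.

```python
def target_bits_per_frame(rate, frame_rate, frames, ratios):
    """Return the target data amount per frame for each frame type.

    rate:       coding rate of the overall image, in bits/second
    frame_rate: frames/second
    frames:     number of frames of each type in one GOP, keyed "I", "P", "B"
    ratios:     code amount ratio of each type, keyed "I", "P", "B"
    """
    # Assumption: FrameGOP is the total of the per-type frame counts.
    frame_gop = sum(frames[t] for t in "IPB")
    gop_bits = rate * frame_gop / frame_rate
    # Shared denominator of equations (1) to (3).
    weight = sum(ratios[t] * frames[t] for t in "IPB")
    return {t: gop_bits * ratios[t] / weight for t in "IPB"}


frames = {"I": 1, "P": 1, "B": 4}   # one six-frame GOP
ratios = {"I": 4, "P": 2, "B": 1}   # hypothetical I:P:B ratio
targets = target_bits_per_frame(4_000_000, 30, frames, ratios)
print(targets)
```

With these illustrative numbers the GOP budget is 800,000 bits, split into 320,000 bits for the I frame, 160,000 bits for the P frame, and 80,000 bits for each B frame, so the per-frame targets add up to the GOP budget.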
The quantizer 406 calculates a new quantization table from the quantization table shown in FIG. 5 and the quantization pitches applied from the data amount controller 408, and quantizes each block. FIG. 6 shows the quantization table newly calculated by the quantizer 406 when the quantization pitch output from the data amount controller 408 is doubled.
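One plausible reading of this step is that every entry of the base table is multiplied by the factor supplied by the data amount controller; the sketch below assumes exactly that. The figures are not reproduced here, so the base table is hypothetical: only its top-left value of 8 comes from the FIG. 5 example in the text, and `scale_table` and `BASE_TABLE` are illustrative names.

```python
def scale_table(base_table, pitch_scale):
    """Multiply every quantization pitch in the table by pitch_scale."""
    return [[pitch * pitch_scale for pitch in row] for row in base_table]


# Hypothetical 8x8 base table whose pitches grow with frequency (row + column).
# Only the top-left value of 8 is taken from the text's FIG. 5 example.
BASE_TABLE = [[8 + 2 * (row + col) for col in range(8)] for row in range(8)]

doubled = scale_table(BASE_TABLE, 2)
print(BASE_TABLE[0][0], doubled[0][0])  # 8 16
```

Doubling the pitches means a coefficient of 23 in the first position quantizes to 1 instead of 2, so fewer distinct levels, and therefore less code, are generated.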
Each compressed frame signal is expanded by the dequantizer 409 and the DCT inverter 410. More specifically, each block coefficient quantized by the quantizer 406 is dequantized by the dequantizer 409, the DCT operation is reversed by the DCT inverter 410, and the data is buffered to the frame memory 411, which is capable of storing two frames of data. The buffered frame is used as the predictive frame in the next processing operation.
Frame P-3 is then read by the frame processing sequence controller 402. Because frame P-3 is a predictive frame using frame I-0 for the prediction, frame I-0 is read from the frame memory 411, and the motion vector between frame I-0 and frame P-3 is calculated by macro block unit (e.g., 16×16 pixels). Motion compensation is accomplished by the subtracter 404 using the motion vector calculated for each macro block, and the difference values between macro blocks are obtained. The DCT unit 405 applies DCT to each block, and the coding unit 407 applies entropy coding. The data amount controller 408 controls the quantization pitches based on the frame type, the target data amount calculated from equation (2), and the instantaneous amount of code generated by the coding unit 407, and applies this information to the quantizer 406. Each block coefficient quantized by the quantizer 406 is then dequantized by the dequantizer 409, the DCT operation is reversed by the DCT inverter 410, the motion vector between frame P-3 and frame I-0 is referenced and added to the block unit, and the data is buffered to the frame memory 411. The buffered frame is used as the predictive frame in the next processing operation. Coding thus proceeds according to the information from the frame locator 401.
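The motion vector detection and subtraction for one macro block can be sketched as follows. The text does not say how the motion vector detector 403 searches, so the exhaustive search minimizing the sum of absolute differences (SAD), the search range, and all function names here are assumptions made for illustration. Frames are plain lists of pixel rows.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(cur, ref, bx, by, size=16, search=4):
    """Full search: return the (dx, dy) within +/-search with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def residual(cur, ref, bx, by, mv, size=16):
    """Difference between the block and its motion-compensated prediction."""
    dx, dy = mv
    return [
        [cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x] for x in range(size)]
        for y in range(size)
    ]
```

The residual block, rather than the original pixels, is what would be passed on to the DCT stage; when the prediction is good, most residual values are close to zero.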
The conventional coding method thus described, however, has the following problems.
(1) The image quality may differ greatly between scenes containing significant motion or scene changes and scenes with relatively little motion or change, and there will also be severe image deterioration at scene changes, because the data amount assigned to each GOP is constant irrespective of the scene content.
(2) Image quality within single GOPs may also vary because the data amount assigned to each frame in the GOP is fixed or calculated from the data amount ratio between the present and previous GOPs.