1. Field of the Invention
The present invention relates to a motion image coding apparatus. More specifically, the present invention relates to a motion image coding apparatus in which a reference frame interval is adaptively controlled based on prediction efficiency.
2. Description of the Background Art
Recently, inter-frame predictive coding utilizing motion compensated inter-frame prediction in accordance with MPEG (Moving Picture Experts Group)-1 (ISO: International Organization for Standardization/IEC: International Electrotechnical Commission 11172) or MPEG-2 (ISO/IEC 13818) has come to be used in the fields of storage, communication, broadcasting and so on, as a method of motion image coding. In such a method, each frame of a moving image sequence is divided into coding blocks, a prediction block is generated for each coding block using a motion vector detected from a reference frame, and motion compensated inter-frame prediction is performed.
The coding modes for coding blocks in MPEG include a forward prediction mode in which prediction based on a reference frame in the past is used, a backward prediction mode in which prediction based on a reference frame in the future is used, a bidirectional prediction mode in which a mean value of predictions based on a reference frame in the past and a reference frame in the future is used, and an intra-frame coding mode in which prediction is not used.
Further, in accordance with MPEG, each frame is classified as an intra-frame coding frame which is coded only in the intra-frame coding mode, a forward prediction coding frame which is coded using the forward prediction mode or the intra-frame coding mode, or a bidirectional prediction coding frame which is coded using the forward prediction mode, the backward prediction mode, the bidirectional prediction mode or the intra-frame coding mode, and the classified frame is coded.
Here, the intra-frame coding frame is referred to as an I picture, the forward prediction coding frame is referred to as a P picture, and the bidirectional prediction coding frame is referred to as a B picture. FIG. 1 shows the prediction structure of I, P and B pictures.
In a sequence of motion images, first, an I picture 21 is coded. An I picture can be decoded using only its own coded data. Thereafter, I picture 21 is used as a reference frame, and a P picture 23 is coded by forward prediction from I picture 21. Thereafter, B picture 22 is coded utilizing any of forward prediction, backward prediction and bidirectional prediction, or not utilizing prediction at all, with I picture 21 serving as a reference frame in the past and P picture 23 serving as a reference frame in the future. Following coding of B picture 22, subsequent P picture 25 is coded using P picture 23 as a reference frame, or not using prediction at all. After coding of P picture 25, B picture 24 is coded using P pictures 23 and 25 as reference frames, or not using prediction at all.
From the foregoing, it can be seen that the order of input of images is I picture 21, B picture 22, P picture 23, B picture 24 and P picture 25 while the order of coding is I picture 21, P picture 23, B picture 22, P picture 25 and B picture 24, and hence it is necessary to encode with the order of images rearranged.
Referring to FIG. 2A, when a time interval between reference frames (hereinafter referred to as a reference frame interval) is assumed to be 1, then the order of input of images is I picture, P picture, P picture, P picture, P picture, P picture, and P picture. At this time, the order of coding is the same.
Referring to FIG. 2B, when the reference frame interval is 2, the order of input of images is B picture, I picture, B picture, P picture, B picture, P picture, B picture and P picture. By contrast, the order of coding is I picture, B picture, P picture, B picture, P picture, B picture, P picture, and B picture.
Referring to FIG. 2C, when the reference frame interval is 3, the order of input of images is B picture, B picture, I picture, B picture, B picture, P picture, B picture, B picture, and P picture. By contrast, the order of coding is I picture, B picture, B picture, P picture, B picture, B picture, P picture, B picture, and B picture.
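The rearrangement described for FIGS. 2A to 2C can be sketched as follows. This is only an illustrative sketch, not the operation of any particular circuit: the function names picture_types and coding_order are hypothetical, and the sketch assumes that every m-th input frame is a reference frame (the first being the I picture), which matches the three orders listed above.

```python
def picture_types(num_frames, m):
    """Picture type of each frame, in input order, for reference frame interval m."""
    types = []
    seen_reference = False
    for i in range(num_frames):
        if (i + 1) % m == 0:           # assumed: every m-th frame is a reference frame
            types.append("P" if seen_reference else "I")
            seen_reference = True
        else:
            types.append("B")
    return types


def coding_order(num_frames, m):
    """Input-order indices of the frames, listed in the order they are coded."""
    order = []
    pending_b = []                     # B pictures waiting for their future reference
    for i in range(num_frames):
        if (i + 1) % m == 0:
            order.append(i)            # the reference frame is coded first
            order.extend(pending_b)    # then the B pictures that precede it in input order
            pending_b = []
        else:
            pending_b.append(i)
    # Trailing B pictures without a later reference are simply appended in this sketch.
    return order + pending_b


types = picture_types(8, 2)
print(" ".join(types))                                  # B I B P B P B P
print(" ".join(types[i] for i in coding_order(8, 2)))   # I B P B P B P B
```

With m = 1 no B pictures exist, so the coding order equals the input order, as in FIG. 2A.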
Referring to FIG. 3, the coding apparatus for the conventional motion image coding described above includes a frame memory 1, an image rearrangement control circuit 28, a motion vector detecting circuit 3, a coding mode determining circuit 4, a motion compensation predicting circuit 5, a selector 6, a subtractor 7, an encoder 8, a decoder 9, an adder 10 and a frame memory 11.
Frame memory 1 temporarily stores images for rearranging the order or sequence of the images. Image rearrangement control circuit 28 is connected to frame memory 1 and controls frame memory 1. Motion vector detecting circuit 3 is connected to frame memory 1 and detects a motion vector for motion compensated inter-frame prediction. Coding mode determining circuit 4 determines coding mode from the information calculated by motion vector detecting circuit 3. Motion compensation predicting circuit 5 is connected to motion vector detecting circuit 3 and coding mode determining circuit 4, and generates a prediction block. Selector 6 receives an output from coding mode determining circuit 4 as a selection signal, and receives an output of motion compensation predicting circuit 5 and `0` as input signals.
Subtractor 7 is connected to outputs of frame memory 1 and selector 6, and calculates a difference block which is a difference between a coding block and a prediction block. Encoder 8 is connected to an output of subtractor 7 and encodes the difference block. Decoder 9 is connected to an output of encoder 8, and decodes encoded data. Adder 10 is connected to outputs of decoder 9 and selector 6, and generates a decoding block by adding the decoded difference block and prediction block. Frame memory 11 is connected to an output of adder 10, and stores a decoded reference frame consisting of the decoding block.
Operation of the coding apparatus will be described in the following. Input images are first written to frame memory 1 in the order of input, and are then read in the order of coding described above, under control of image rearrangement control circuit 28. Pixel data of the coding block read from frame memory 1 are supplied to motion vector detecting circuit 3.
Motion vector detecting circuit 3 reads data of the reference frame from frame memory 11, performs block matching calculation with the coding block, and detects motion vector. At this time, in motion vector detecting circuit 3, prediction error in motion compensated inter-frame prediction and complexity of images (hereinafter referred to as "activity" in the specification) of the coding block are calculated and supplied to coding mode determining circuit 4.
Coding mode determining circuit 4 determines the coding mode of the coding block of interest, using information such as the prediction error and the activity output from motion vector detecting circuit 3. As the prediction mode, the one with the smallest prediction error is selected. Whether intra-frame coding is used is determined by comparing the magnitudes of the prediction error and the activity: when the prediction error is small, inter-frame prediction is selected, and when the activity is small, intra-frame coding is selected.
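The decision described above might be sketched as follows. The text states only that the prediction error and the activity are compared in magnitude; the specific rule used here (intra-frame coding when the activity is smaller than the smallest prediction error) and the name choose_coding_mode are assumptions made for illustration.

```python
def choose_coding_mode(prediction_errors, activity):
    """Pick a coding mode for one coding block.

    prediction_errors maps each candidate prediction mode (for example
    "forward", "backward", "bidirectional") to its prediction error.
    """
    # The prediction mode with the smallest prediction error is the candidate.
    best_mode = min(prediction_errors, key=prediction_errors.get)
    best_error = prediction_errors[best_mode]
    # Assumed comparison rule: a block whose activity is smaller than its
    # best prediction error is coded without prediction.
    if activity < best_error:
        return "intra"
    return best_mode


errors = {"forward": 120, "backward": 90, "bidirectional": 100}
print(choose_coding_mode(errors, activity=500))   # backward
print(choose_coding_mode(errors, activity=50))    # intra
```

For a P picture, the dictionary would hold only the forward prediction error, since that is the only prediction mode available.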
Motion compensation predicting circuit 5 generates a prediction block using pixel data of the reference frame read from frame memory 11, in accordance with the prediction mode determined by coding mode determining circuit 4.
Selector 6 switches outputs in accordance with the prediction mode determined by coding mode determining circuit 4. In the intra-frame coding mode, `0` is selected; otherwise, the output (prediction block) of motion compensation predicting circuit 5 is selected. Here, `0` indicates that there is no block to be subtracted in subtractor 7 and no block to be added in adder 10.
A difference block between coding block and prediction block is calculated by subtractor 7. The difference block is coded by encoder 8, and coded data is output.
I picture and P picture are used for prediction of subsequent frames, as reference frames. Therefore, coded data of the I picture and the P picture are decoded by decoder 9, and the decoded difference block is added to the prediction block in adder 10. An output (decoding block) of adder 10 is stored in frame memory 11.
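The data path through selector 6, subtractor 7, encoder 8, decoder 9, adder 10 and frame memory 11 can be sketched as below. The text does not specify how encoder 8 codes the difference block, so a plain uniform quantizer with a hypothetical step QUANT_STEP stands in for it here; blocks are flat lists of pixel values, and all function names are illustrative assumptions.

```python
QUANT_STEP = 8  # hypothetical quantizer step; a stand-in for the real encoder


def encode(diff_block):
    """Stand-in for encoder 8: uniform quantization of the difference block."""
    return [round(d / QUANT_STEP) for d in diff_block]


def decode(coded_block):
    """Stand-in for decoder 9: inverse of the quantization above."""
    return [c * QUANT_STEP for c in coded_block]


def code_block(coding_block, prediction_block, intra, is_reference, reference_store):
    """Code one block; store its locally decoded version if it is in an I or P picture."""
    # Selector 6: `0` in intra-frame coding mode, otherwise the prediction block.
    selected = [0] * len(coding_block) if intra else prediction_block
    # Subtractor 7: difference block.
    diff = [c - p for c, p in zip(coding_block, selected)]
    coded = encode(diff)
    if is_reference:
        # Decoder 9 and adder 10: rebuild the block exactly as a decoder would see it.
        decoding_block = [d + p for d, p in zip(decode(coded), selected)]
        reference_store.append(decoding_block)   # frame memory 11
    return coded


store = []
print(code_block([103, 92], [100, 100], False, True, store))  # [0, -1]
print(store)                                                  # [[100, 92]]
```

In the usage above the first pixel is reconstructed as 100 rather than 103. Because the reconstructed block, not the original, is what goes into the reference store, this coding error is carried into any later frame predicted from it.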
As described above, coded data of I and P pictures are decoded and used for prediction of subsequent frames. Therefore, coding errors generated in I and P pictures are propagated along the time axis through P pictures. By contrast, a B picture is not used for prediction of other frames. Therefore, coding error generated in the B picture is not propagated.
The inter-frame prediction of the P picture is based only on the past, while inter-frame prediction of the B picture is based both on the past and the future. Therefore, generally, prediction error is smaller in B picture than P picture, and the amount of coding data generated is smaller.
Utilizing the nature described above, smaller amount of information is allocated to B picture from which coding error is not propagated and not much coding data is generated, while larger amount of information is allocated to I and P pictures from which coding error propagates. Consequently, I and P pictures come to have higher image quality, and prediction error of B picture using I and P pictures for prediction is reduced. Since the prediction error of the B picture is reduced and amount of information necessary for coding the B picture is reduced, amount of information to be allocated to I and P pictures is increased. As the amount of information allocated to I and P picture is increased, I and P pictures come to have ever higher image quality. In this manner, the motion image coding enters a virtuous circle, enabling enhancement of image quality of the entire sequence.
However, if I and P pictures are poor in quality, prediction error of B picture increases, requiring larger amount of information for coding the B picture. As larger amount of information is required for coding the B picture, the amount of information to be allocated to I and P pictures is reduced, lowering image quality of I and P pictures. As the image quality of I and P pictures lowers, the prediction error of B picture increases. In this manner, motion image coding enters a vicious circle, considerably degrading image quality of the entire sequence.
Here, the amount of prediction error in motion compensated inter-frame prediction depends on distance in time between the coding frame and reference frame, area of search of the motion vector, amount of movement of an object and so on.
The method of detecting a motion vector for motion compensated inter-frame prediction will be described with reference to FIG. 4. The block matching method is generally known as a method of detecting a motion vector. In the block matching method, for each prediction block candidate in an area of motion vector search, an amount of error between the prediction block candidate and the coding block is calculated. The candidate with the smallest amount of error is taken as the prediction block, and the offset of the prediction block position relative to the coding block position is taken as the motion vector.
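A minimal full-search sketch of the block matching method follows. The text does not name the error measure, so the sum of absolute differences (SAD) is assumed here; frames and blocks are lists of rows of pixel values, and the names sad and block_matching are hypothetical.

```python
def sad(frame, top, left, block):
    """Sum of absolute differences between block and the frame region at (top, left)."""
    return sum(
        abs(frame[top + y][left + x] - block[y][x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def block_matching(reference, block, top, left, search_x, search_y):
    """Full search over +/-search_x, +/-search_y around the coding block position.

    Returns ((dx, dy), error) for the candidate with the smallest error.
    """
    height, width = len(reference), len(reference[0])
    block_h, block_w = len(block), len(block[0])
    best_vector, best_error = None, None
    for dy in range(-search_y, search_y + 1):
        for dx in range(-search_x, search_x + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block_h > height or x + block_w > width:
                continue               # candidate lies outside the reference frame
            error = sad(reference, y, x, block)
            if best_error is None or error < best_error:
                best_vector, best_error = (dx, dy), error
    return best_vector, best_error


# An 8x8 reference frame with distinct pixel values, and a 2x2 coding block
# whose content sits at row 3, column 4 of the reference frame.
reference = [[8 * y + x for x in range(8)] for y in range(8)]
block = [row[4:6] for row in reference[3:5]]
print(block_matching(reference, block, top=2, left=2, search_x=2, search_y=2))
# ((2, 1), 0)
```

The cost of the search grows with the number of candidates, which is why the size of the search area matters in the discussion that follows.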
The farther the coding frame and the reference frame are away from each other along the time axis, the larger becomes the amount of movement of the object, and therefore larger area of motion vector search is necessary.
Referring to FIG. 5, assume that the area of search of the motion vector necessary when the coding frame and the reference frame are away from each other by one frame time is ±K in the horizontal direction and ±L in the vertical direction. Here, if the coding frame and the reference frame are away from each other by two frame times, the necessary area of search of the motion vector is ±2K in the horizontal direction and ±2L in the vertical direction.
If the coding frame and the reference frame are further away in time, movement of the object would involve complex components such as rotation and deformation, not only translation, which makes prediction difficult.
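The growth of the search area can be put into numbers with a short sketch. The text gives the ranges only for distances of one and two frame times; extending them linearly to any distance d, and counting candidates at integer pixel positions for a full search, are both assumptions of this sketch, as are the function names and the example values K = 15 and L = 7.

```python
def search_range(frame_distance, k, l):
    """Horizontal and vertical search range, assuming linear growth with distance."""
    return frame_distance * k, frame_distance * l


def candidate_count(frame_distance, k, l):
    """Number of integer-pel candidates a full search would have to evaluate."""
    range_x, range_y = search_range(frame_distance, k, l)
    return (2 * range_x + 1) * (2 * range_y + 1)


print(candidate_count(1, 15, 7))   # 465
print(candidate_count(2, 15, 7))   # 1769
```

Under these assumptions, doubling the distance between the coding frame and the reference frame roughly quadruples the number of candidates per coding block.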
Therefore, generally, when the coding frame and reference frame are farther away in time, prediction error is increased. More specifically, the larger the reference frame interval, the larger the prediction error in coding the P picture. On the other hand, when the reference frame interval is larger and the number of frames of the B picture existing between the I and P pictures increases, the ratio of B picture to which small amount of information is allocated increases. Accordingly, the number of bits to be allocated to the I and P pictures is increased, contributing to enhanced image quality of I and P pictures.
From the foregoing, it can be seen that there is an optimal value of reference frame interval for each image sequence.
Japanese Patent Laying-Open No. 8-65678 entitled "Moving Image Encoding System" discloses a method of optimizing reference frame interval m and the number n of P pictures GOP by GOP (Group of Pictures) each consisting of N frames. Here, in the GOP, there is one I picture, and values N, m and n satisfy the following equation (1).
N = m(n + 1)   (1)
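Equation (1) limits the combinations (m, n) that are possible for a given GOP length N: m must divide N, and n then follows as N/m - 1. A small sketch that lists them (the function name prediction_structures is hypothetical, and N = 12 is only an example value):

```python
def prediction_structures(n_frames):
    """All (m, n) with N = m * (n + 1) for a GOP of n_frames frames."""
    return [
        (m, n_frames // m - 1)
        for m in range(1, n_frames + 1)
        if n_frames % m == 0           # m must divide N exactly
    ]


print(prediction_structures(12))
# [(1, 11), (2, 5), (3, 3), (4, 2), (6, 1), (12, 0)]
```

For N = 12, for example, m = 3 gives n = 3: one I picture, three P pictures and eight B pictures.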
Referring to FIG. 6, an apparatus performing the process disclosed in Japanese Patent Laying-Open No. 8-65678 includes a frame memory 1, an image rearrangement control circuit 2, a motion vector detecting circuit 3, a coding mode determining circuit 4, a motion compensation predicting circuit 5, a selector 6, a subtractor 7, an encoder 8, a decoder 9, an adder 10, a frame memory 11, a correlation calculating circuit 26 and a prediction structure determining circuit 27. Portions corresponding to those of the conventional coding apparatus described with reference to FIG. 3 are denoted by the same reference characters. Names and functions are the same and therefore description thereof is not repeated.
Correlation calculating circuit 26 is connected to outputs of frame memory 1 and motion compensation predicting circuit 5, and calculates correlation coefficient between the coding block and the prediction block. Prediction structure determining circuit 27 determines the aforementioned reference frame interval m and the number n of P pictures, based on the correlation coefficient calculated by correlation calculating circuit 26.
Correlation calculating circuit 26 calculates the correlation coefficient ρ represented by the following equation (2). ##EQU1## where x(s) represents the pixel value of the coding image in the s-th frame, and x(s-1) represents the pixel value of the coding image in the (s-1)-th frame. E[·] represents an operation for calculating a mean value.
Prediction structure determining circuit 27 calculates coding efficiency Gain represented by the equation (3) for every possible combination of (m, n) in the number N of frames of GOP, and finds that set of (m, n) which provides the maximum Gain. ##EQU2## where w_P and W_B are constants and S(m-1) is given by the following equation (4). ##EQU3##
Image rearrangement control circuit 2 determines positions of I, P and B pictures in accordance with the combination (m, n) determined by prediction structure determining circuit 27, and reads coding frame from frame memory 1.
In the apparatus disclosed in Japanese Patent Laying-Open No. 8-65678, however, complicated calculations as represented by the equations (2), (3) and (4) are indispensable. These include a plurality of multiplications, divisions and power calculations. Therefore, implementation requires a large scale operating circuit and a long calculation time. Further, equation (2) represents a calculation performed pixel by pixel, which involves a formidable amount of processing and, as it includes two multiplications, requires a significantly large circuit scale.
Calculation of (m, n) is performed GOP by GOP, and therefore when the correlation coefficient given by the equation (2) changes abruptly because of abrupt movement of the object or camera at the time of real time coding, change in prediction structure may not follow the change in image, possibly resulting in degraded image quality.