The present invention relates to a moving image signal coding apparatus and a moving image signal coding control method for encoding moving image signals using inter-frame prediction and intra-frame prediction.
In moving image signal coding utilizing inter-frame prediction, coding without inter-frame prediction, i.e., refreshing, should be performed occasionally so that degradation of the quality of a reproduced image due to accumulation of prediction errors is kept equal to or lower than a predetermined level. Generally, in recent moving image signal coding methods, the refreshing is performed in units of blocks, as in CCITT Recommendation H.261. It should be noted, however, that the refreshing unit is not limited to the block.
A conventional refresh control method that refreshes all the blocks within one frame at a fixed period is classified into one of the following:
(1) a method of refreshing all the blocks within one frame at a single frame timing; and
(2) a method of refreshing one or more blocks at each frame timing and shifting the position of the refreshing block (the block being refreshed at a given time) as the frames are updated.
Another example of the refresh control method is:
(3) a method of dividing one frame into two or more areas and refreshing the respective areas at different periods corresponding to the respective areas.
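As an illustration only, methods (1) and (2) can be sketched as small scheduling functions that return the addresses of the blocks to refresh in a given frame. The function names, the linear block numbering, and the parameters `period` and `blocks_per_frame` are assumptions made for this sketch and do not come from any particular coder.

```python
def refresh_all_at_once(frame_idx, num_blocks, period):
    """Method (1): refresh every block in one frame, once per period."""
    if frame_idx % period == 0:
        return list(range(num_blocks))
    return []


def refresh_cyclic(frame_idx, num_blocks, blocks_per_frame):
    """Method (2): refresh a few blocks per frame, shifting the position
    of the refreshed blocks each time the frame is updated."""
    start = (frame_idx * blocks_per_frame) % num_blocks
    return [(start + i) % num_blocks for i in range(blocks_per_frame)]


if __name__ == "__main__":
    # Hypothetical frame of 6 blocks, observed over 4 frames.
    for f in range(4):
        print(f, refresh_all_at_once(f, 6, 3), refresh_cyclic(f, 6, 2))
```

In this sketch, method (1) concentrates all intra-coded blocks in a single frame per period, whereas method (2) spreads them evenly over successive frames.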
In a conventional prediction mode switchover control method for a moving image signal coding apparatus which encodes moving image signals using inter-frame prediction and intra-frame prediction, the correlation between a block to be encoded in a current frame and a block at the corresponding position in a previous frame is calculated, and the prediction mode is switched over in accordance with the obtained correlation amount. More specifically, if the correlation amount is equal to or greater than a predetermined threshold value, the prediction mode is switched over to an inter-frame prediction mode, while the prediction mode is switched over to an intra-frame prediction mode if the correlation amount is less than the threshold value.
The calculation of the correlation between the blocks differs depending upon the definition of the correlation. As an example, the correlation is obtained by calculating the differences between pixels at corresponding positions in the two blocks and then calculating the reciprocal of the variance of the calculated difference values.
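A minimal sketch of this switchover rule follows, taking the spread of the pixel differences to be their statistical (population) variance, which is an assumption about the definition. The function names, the flat-list block representation, and the small constant `eps` that guards against division by zero are hypothetical details added for illustration.

```python
from statistics import pvariance


def block_correlation(cur_block, prev_block, eps=1e-9):
    """Reciprocal of the variance of the pixel-wise differences.

    cur_block and prev_block are equal-length sequences of pixel values.
    eps is an assumed guard so that identical blocks do not divide by zero.
    """
    diffs = [c - p for c, p in zip(cur_block, prev_block)]
    return 1.0 / (pvariance(diffs) + eps)


def select_mode(cur_block, prev_block, threshold):
    """Inter-frame mode if the correlation reaches the threshold, else intra."""
    if block_correlation(cur_block, prev_block) >= threshold:
        return "inter"
    return "intra"


if __name__ == "__main__":
    similar = select_mode([10, 20, 30, 40], [11, 21, 31, 41], threshold=0.01)
    different = select_mode([0, 255, 0, 255], [255, 0, 255, 0], threshold=0.01)
    print(similar, different)
```

Note that under this definition a uniform brightness offset between the two blocks yields a zero variance and therefore a very high correlation.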
Generally, the coding amount increases when refreshing is performed during encoding of a moving image signal. For example, in the aforementioned method (1), the coding amount of a frame being refreshed suddenly increases, disturbing moving image communication at a constant frame rate.
In the methods (1) and (2), because the refreshing is performed at all positions within one frame at a single period, a higher-interest position which attracts a viewer's attention, such as the central portion of the frame, and a lower-interest position, such as the circumferential portion of the frame, are refreshed in the same manner. A conceivable problem is that the refreshing in the lower-interest position is wasted, because the improved image quality of the circumferential portion does not substantially influence the overall image quality.
In the method (3), the higher-interest area can be refreshed at a short period and the lower-interest area can be refreshed at a long period. Compared with the methods (1) and (2), the method (3) provides reproduced images of subjectively higher quality. However, because one frame is divided into two or more areas and the refresh control is performed at a different period in each area, the control method tends to be complicated.
FIGS. 2A and 2B illustrate the concept of the method (3). FIG. 2A shows an example where the frame is divided into two rectangular areas, assuming that the central portion of the frame is the higher-interest area. In this case, the circumferential area is doughnut-shaped, complicating the calculation of the block addresses included in this area.
In order to simplify the block address calculation, the circumferential area should be further divided into four areas (FIG. 2B, I-1 to I-4). In this case, though each address calculation is simplified, the number of areas, which was initially two, has increased to five. As a result, the refresh control as a whole becomes complicated.
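The trade-off can be made concrete with a short sketch that enumerates the block addresses of the circumferential area as four plain rectangles. The particular split (full-width top and bottom strips plus left and right strips) is an assumption, since the exact layout of I-1 to I-4 in FIG. 2B is not described here, and the 11 x 9 block grid and the central rectangle in the usage example are hypothetical values.

```python
def rect_blocks(x0, y0, x1, y1):
    """Block addresses (col, row) in the half-open rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for y in range(y0, y1) for x in range(x0, x1)}


def split_border(width, height, cx0, cy0, cx1, cy1):
    """Split the area surrounding the central rectangle into four rectangles.

    (cx0, cy0)-(cx1, cy1) is the central, higher-interest rectangle.
    The strip layout below is an assumed one, not taken from FIG. 2B.
    """
    return [
        rect_blocks(0, 0, width, cy0),        # top strip, full width
        rect_blocks(0, cy1, width, height),   # bottom strip, full width
        rect_blocks(0, cy0, cx0, cy1),        # left strip
        rect_blocks(cx1, cy0, width, cy1),    # right strip
    ]


if __name__ == "__main__":
    width, height = 11, 9                     # hypothetical block grid
    centre = rect_blocks(3, 2, 8, 7)
    strips = split_border(width, height, 3, 2, 8, 7)
    border = set().union(*strips)
    # The four strips tile exactly the doughnut-shaped area.
    print(len(centre), len(border), [len(s) for s in strips])
```

Each strip needs only a simple nested loop, but the refresh controller now has to track five areas instead of two, which is the complication described above.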
Further, in the aforementioned conventional prediction mode switchover control method, the prediction mode is changed over for all the blocks in the respective frames under the same condition. This type of coding control method, which does not depend upon frame or block position, has high versatility; however, it is not necessarily an appropriate control method for a specific practical application. For example, with visual telephones, communication is generally performed between two persons. A person at one end watches the face of the person at the other end, displayed at the central portion of the screen, and seldom pays attention to the circumferential portion of the screen image.
The transmission speed of the communication lines used for visual telephones is usually 64 kbps to 128 kbps, which is not sufficient to obtain high image quality throughout the whole frame. If the entire frame is encoded under the same condition in such low-bit-rate transmission, the image quality of the whole reproduced image is degraded. In particular, the image quality immediately after the start of communication, or in the several frames immediately after a scene change, is degraded because the correlation between the frames becomes extremely small after such operations. For this reason, in the conventional prediction mode switchover control method, the prediction mode is set to the intra-frame prediction mode in almost all the blocks within these frames.
However, the coding amount tends to be larger in the intra-frame prediction mode than in the inter-frame prediction mode. In CCITT Recommendation H.261, a standard moving image signal coding method for teleconference systems and visual telephones, the coding amount within one block can be reduced to almost zero in the inter-frame prediction mode. In the intra-frame prediction mode, however, there is a lower limit on the coding amount necessary for one block, and for this reason, if the intra-frame prediction mode is selected successively, the coding amount exceeds the communication capacity.
In order to avoid this problem, Recommendation H.261 arranges for the output from an encoder to be temporarily stored in a transmission buffer, and the coding is controlled so that the transmission buffer does not overflow. Consequently, in a frame immediately after a scene change, the intra-frame prediction mode is selected successively; however, at the point in time where the transmission buffer is about to overflow, the inter-frame prediction mode is selected to forcibly set the prediction errors between the frames to zero and prevent generation of coded data.
This forcible selection of the inter-frame prediction mode continues until the amount of coded data in the transmission buffer becomes equal to or lower than a predetermined level. As a result, when a scene change occurs in low-bit-rate coding, there are many blocks in which the image from before the scene change remains as it is over several frames right after the scene change. Further, these blocks are at irregular positions, which lowers the quality of the reproduced image.
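The behaviour described above can be illustrated with a toy simulation of the transmission buffer. This is only a sketch under simplifying assumptions: every intra-coded block is assumed to cost a fixed number of bits, the buffer is assumed to drain by a fixed amount per block period, and the function name and all numeric values are hypothetical rather than taken from Recommendation H.261.

```python
def encode_after_scene_change(num_blocks, intra_bits, drain_per_block,
                              capacity, resume_level):
    """Return the mode chosen for each block right after a scene change.

    A block is intra-coded unless that would overflow the buffer. Once
    overflow is imminent, inter mode is forced (zero prediction error, no
    coded data) until the buffer fullness falls to resume_level or below.
    """
    fullness = 0
    forced = False
    modes = []
    for _ in range(num_blocks):
        if forced and fullness <= resume_level:
            forced = False                    # buffer has drained enough
        if not forced and fullness + intra_bits > capacity:
            forced = True                     # buffer is about to overflow
        if forced:
            modes.append("inter-forced")      # old image content is kept
        else:
            modes.append("intra")
            fullness += intra_bits
        fullness = max(0, fullness - drain_per_block)
    return modes


if __name__ == "__main__":
    modes = encode_after_scene_change(
        num_blocks=10, intra_bits=100, drain_per_block=20,
        capacity=250, resume_level=100)
    print(modes)
```

In this toy run most blocks end up in the forced inter-frame mode, and which blocks those are depends only on the momentary buffer state, which corresponds to the irregular positions of stale blocks noted above.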