1. Field of the Invention
This invention relates to a moving image encoding method and apparatus suitable for recording a moving image signal on, or reproducing it from, a recording medium such as a magneto-optical disc or magnetic tape for display, or for transmitting a moving image signal from a transmitting side to a receiving side over a transmission line, as in video conference systems, video telephone systems, or broadcasting equipment. The invention also relates to a moving image decoding method and apparatus for reproducing a moving image signal from a signal obtained by motion compensation predictive coding.
2. Description of the Related Art
Heretofore, when a moving image is digitized and then recorded or transmitted, the data has been coded (compressed) because the data size becomes massive. A representative coding method is motion compensation predictive coding.
FIG. 1 shows the principles of the motion compensation predictive coding. The motion compensation predictive coding is a method which makes use of the correlation of video signals in the time axis direction. That is, the motion of the picture to be coded is estimated from a video signal which has already been decoded and reproduced, the decoded and reproduced video signal is moved in accordance with that motion, and the movement data (motion vector) and the prediction error obtained at that time are transmitted, so as to compress the data size necessary for coding.
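The principle described above can be sketched as follows. This is a minimal illustration, not the method of any particular encoder: the function names, the representation of a picture as a list of pixel rows, and the (dy, dx) motion vector convention are all assumptions made for the example.

```python
def predict_block(reference, top, left, motion_vector, size):
    """Copy a size x size block from the reference picture, displaced by (dy, dx)."""
    dy, dx = motion_vector
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def prediction_error(current, reference, top, left, motion_vector, size):
    """Difference between the block to be coded and its motion-compensated prediction."""
    predicted = predict_block(reference, top, left, motion_vector, size)
    return [
        [current[top + r][left + c] - predicted[r][c] for c in range(size)]
        for r in range(size)
    ]


# Hypothetical 4 x 4 pictures: the current picture is the reference shifted
# one pixel to the right, so a motion vector of (0, -1) predicts it exactly.
reference = [[r * 4 + c for c in range(4)] for r in range(4)]
current = [[reference[r][max(c - 1, 0)] for c in range(4)] for r in range(4)]
error = prediction_error(current, reference, 0, 1, (0, -1), 2)
```

When the motion vector matches the actual movement, the prediction error is all zeros, which is why transmitting the vector and the error takes less data than transmitting the block itself.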
The Moving Picture Experts Group (MPEG) scheme is known as a representative of this motion compensation predictive coding. MPEG is the popular name of the moving image encoding scheme compiled in Working Group (WG) 11 of Sub Committee (SC) 29 of Joint Technical Committee (JTC) 1 of ISO and IEC.
In the MPEG, one picture (frame or field) is divided into small units called macroblocks, each constituted by 16 lines × 16 pixels, and motion compensation predictive coding is performed at units of this macroblock. The motion compensation predictive coding is roughly grouped into two methods: intra coding and non-intra coding. The intra coding is a coding method which uses only information of a self-macroblock, and the non-intra coding is a coding method which uses both the information of a self-macroblock and information obtained from a picture which appears at another time.
In the MPEG, each frame picture is coded as any of three kinds of pictures: an intra coded picture (I-picture), a predictive coded picture (P-picture), and a bidirectionally predictive coded picture (B-picture). As shown in FIGS. 2A and 2B for example, the video signal of the 17 frames F1 to F17 is treated as a group of pictures (GOP), which is one unit of processing.
As shown in FIGS. 2A and 2B, the video signal of first frame F1 of the GOP is coded as an I-picture, that of second frame F2 is coded as a B-picture, and that of third frame F3 is coded as a P-picture, for example. The frames F4 to F17 are alternately processed as a B-picture or a P-picture. In FIGS. 2A and 2B, an arrow from one picture to another represents a direction of prediction (the same shall apply hereinafter).
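The picture type assignment of this example GOP can be written out as a small sketch. The helper name picture_type is hypothetical, and the rule below only reproduces the pattern described for frames F1 to F17 (F1 as an I-picture, then B- and P-pictures alternating); it is not a general MPEG rule.

```python
def picture_type(frame_number):
    """Picture type of frame F<frame_number> in the 17-frame example GOP."""
    if frame_number == 1:
        return "I"
    # From F2 onward, even-numbered frames are B-pictures and odd-numbered
    # frames are P-pictures, following the example in the text.
    return "B" if frame_number % 2 == 0 else "P"


gop = "".join(picture_type(n) for n in range(1, 18))
```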
For the I-picture, the video signal of one frame is coded and transmitted as it is. For the P-picture, as shown in FIG. 2A, basically the difference between the video signal of the P-picture and the video signal of the I-picture or P-picture preceding it in time is coded and transmitted. For the B-picture, as shown in FIG. 2B, basically the difference between the video signal of the B-picture and either a frame preceding it in time or a frame following it in time is coded and transmitted, or the difference between the video signal of the B-picture and both a preceding frame and a following frame is coded and transmitted.
The principles of a method of coding a moving image signal are shown in FIGS. 3A and 3B. As shown in FIGS. 3A and 3B, since the first frame F1 is processed as an I-picture, all of the macroblocks are intra-coded and transmitted as transmission data F1X to a transmission line. For the frame F3 of the P-picture, with the frame F1 preceding it in time as a reference picture, a prediction error (SP3) from the frame F1 is calculated and transmitted as transmission data F3X, together with a motion vector x3 (forward predictive coding). Alternatively, the original data of the frame F3 is transmitted as it is as transmission data F3X (SP1) (intra coding). These methods can be switched at units of a macroblock.
For the frame F2 of the B-picture, there is calculated a prediction error between the frame F2 and either or both of the frame F1 being in the past point of time and the frame F3 being in the future point of time, and this is transmitted as transmission data F2X. For the process of this B-picture, there are four kinds of processes at a macroblock unit: (1) intra mode (intra coding), (2) forward predictive mode (forward predictive coding), (3) backward predictive mode (backward predictive coding), and (4) bidirectionally predictive mode (bidirectionally predictive coding).
The process in the intra mode is the process (SP1) of transmitting the data of the original frame F2, as it is, as transmission data F2X, and is the same process as the case of the I-picture. The process in the forward predictive mode is the process of transmitting a prediction error SP3 obtained from the reference frame F1 being in the past point of time and also transmitting the motion vector x1 (motion vector between the frames F1 and F2). The process of the backward predictive mode is the process of calculating a prediction error (SP2) with the reference frame F3 being in the future point of time and transmitting the error (SP2) and the motion vector x2 (motion vector between the frames F3 and F2).
The process in the bidirectionally predictive mode is the process of obtaining a prediction error SP4 from an average value of two prediction pictures obtained from both the past reference frame F1 and the future reference frame F3, and transmitting this error as transmission data F2X together with the motion vectors x1 and x2. For the B-picture, the aforementioned four kinds of methods can be switched at units of a macroblock. Among these methods, the forward predictive mode, the backward predictive mode, and the bidirectionally predictive mode are non-intra coding methods.
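The four B-picture processes can be compared side by side in a short sketch. The function name, the mode strings, and the flattened-list representation of a macroblock are assumptions made for illustration. The text says only that the bidirectional prediction is an average of the two prediction pictures; the rounding to the nearest integer used here is an assumption.

```python
def transmitted_data(current, mode, forward=None, backward=None):
    """Data to be coded for one macroblock (given as a flat list of pixels).

    forward and backward are the motion-compensated predictions obtained
    from the past and future reference frames, respectively.
    """
    if mode == "intra":
        return list(current)  # original data, as it is
    if mode == "forward":
        prediction = forward
    elif mode == "backward":
        prediction = backward
    elif mode == "bidirectional":
        # Average of the two predictions (rounding is an assumption).
        prediction = [(f + b + 1) // 2 for f, b in zip(forward, backward)]
    else:
        raise ValueError("unknown mode: " + mode)
    return [c - p for c, p in zip(current, prediction)]


# Hypothetical two-pixel example.
current = [10, 20]
forward = [8, 20]
backward = [12, 22]
residuals = {
    mode: transmitted_data(current, mode, forward, backward)
    for mode in ("intra", "forward", "backward", "bidirectional")
}
```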
The moving image encoding apparatus should select a method whose coding efficiency is best among the aforementioned four modes, in coding the macroblock of the B-picture. Ideally, it is desirable that the macroblock be coded with four kinds of methods and then a method where the size of transmission data is least be selected. However, this method has the problem that the scale of the hardware becomes large.
As a method for solving this problem, U.S. patent application Ser. No. 08/123560, filed on Sep. 17, 1993 (now U.S. Pat. No. 5,461,420, issued on Dec. 24, 1995), proposes a method in which, in the process of estimating the forward and backward motion vectors of a macroblock (motion estimation: ME), forward and backward motion vector estimation errors (ME errors) are obtained and, based on these values, the non-intra predictive coding of the macroblock is selected.
The motion vector estimation error is obtained by, for example, calculating the sum of the absolute values of the per-pixel differences between a prediction macroblock obtained from the motion vector and the macroblock to be coded. This motion vector estimation error is obtained for both the forward vector and the backward vector. The selection method of the non-intra coding at this time will be explained based on FIG. 4.
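The estimation error described above, the sum of absolute differences between the prediction macroblock and the macroblock to be coded, can be sketched as follows. The function name and the list-of-rows block representation are assumptions for the example.

```python
def motion_estimation_error(predicted, target):
    """Sum of absolute per-pixel differences between two equally sized blocks."""
    return sum(
        abs(p - t)
        for predicted_row, target_row in zip(predicted, target)
        for p, t in zip(predicted_row, target_row)
    )


# Hypothetical 2 x 2 blocks: |1-1| + |2-4| + |3-0| + |4-4| = 5.
example_error = motion_estimation_error([[1, 2], [3, 4]], [[1, 4], [0, 4]])
```

The same function would be evaluated once with the forward prediction and once with the backward prediction to obtain the two errors.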
In FIG. 4, if the estimation error of the forward motion vector and the estimation error of the backward motion vector are expressed by Ef and Eb, respectively, then non-intra coding will be selected as follows. That is, (1) in the case of Eb > j×Ef, the forward predictive mode is selected. (2) In the case of Eb < k×Ef, the backward predictive mode is selected. (3) In the case of k×Ef ≤ Eb ≤ j×Ef, the bidirectionally predictive mode is selected. In these equations, “j” and “k” are, for example, j=2 and k=½.
In this selection method, when the estimation error Ef of the forward motion vector is relatively small (e.g., half) as compared with the estimation error Eb of the backward motion vector, the forward predictive mode is selected, and when the estimation error Eb of the backward motion vector is relatively small (e.g., half) as compared with the estimation error Ef of the forward motion vector, the backward predictive mode is selected. In the case other than these cases, the bidirectionally predictive mode is selected.
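The three-way selection rule can be expressed directly in code. The function name and the returned mode strings are assumptions; the thresholds default to the example values j=2 and k=1/2 given in the text.

```python
def select_non_intra_mode(ef, eb, j=2.0, k=0.5):
    """Select the non-intra coding mode from the forward (ef) and backward (eb) ME errors."""
    if eb > j * ef:
        return "forward"        # forward error is relatively small
    if eb < k * ef:
        return "backward"       # backward error is relatively small
    return "bidirectional"      # k*ef <= eb <= j*ef


modes = [
    select_non_intra_mode(100, 250),
    select_non_intra_mode(100, 40),
    select_non_intra_mode(100, 100),
]
```

Note that the boundary cases Eb = j×Ef and Eb = k×Ef fall into the bidirectionally predictive mode, as condition (3) states.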
Incidentally, in the case of a prediction structure where a single B-picture exists between I- and P-pictures or between P-pictures, such as that shown in FIGS. 2A and 2B, the aforementioned selection method of the non-intra coding can obtain a satisfactory result. However, in the case of a prediction structure where two or more B-pictures exist between I- and P-pictures or between P-pictures, as shown for example in FIGS. 5A and 5B, applying this selection method of the non-intra coding raises the problem that coding efficiency is not good.
That is, as compared with the method where all kinds of non-intra coding are tried and the one giving the smallest data size is selected, the aforementioned selection method of the non-intra coding selects the bidirectionally predictive mode by mistake in many cases. In other words, there was the problem that the bidirectionally predictive mode is selected even in cases where its coding efficiency is bad.
On the other hand, in a case where the coded data obtained by coding a moving image signal by motion compensation predictive coding is decoded, the information for motion compensation and the prediction error is decoded from the transmitted coded data, the reference picture indicated by the motion compensation information is moved based on the motion vector, and the prediction error is added to the moved reference picture, thereby reproducing the moving image.
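The decoding step described above, moving the reference picture by the motion vector and adding the prediction error, can be sketched as follows. The function name, the (dy, dx) vector convention, and the list-of-rows picture representation are assumptions made for this illustration.

```python
def reconstruct_block(reference, top, left, motion_vector, error):
    """Rebuild one block: displaced reference block plus decoded prediction error."""
    dy, dx = motion_vector
    rows = len(error)
    cols = len(error[0])
    return [
        [reference[top + dy + r][left + dx + c] + error[r][c] for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 4 x 4 reference picture and a 2 x 2 decoded prediction error.
reference = [[r * 4 + c for c in range(4)] for r in range(4)]
block = reconstruct_block(reference, 0, 1, (0, -1), [[1, 0], [0, -1]])
```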
More specifically, a method of decoding motion compensation coded data is described with reference to FIGS. 5A and 5B, which show a prediction structure where two or more B-pictures exist between I- and P-pictures or between P-pictures. Initially, the coded data of the I-picture shown at the frame F1 is received. The I-picture is decoded without referring to other pictures because, as described above, all of its macroblocks have been coded by intra coding, i.e., they have been coded using only the picture's own video signal.
Next, the coded data of the P-picture shown at the frame F4 is received. Since basically the P-picture has been coded with the motion compensation prediction from the I- or P-picture being in the past, as shown in FIG. 2A, the P-picture is motion-compensated and decoded by using as a reference picture the decoded picture obtained by decoding the coded data of the past I- or P-picture. In the case of this frame F4, the P-picture is motion-compensated and decoded with the decoded picture of the frame F1 as a reference picture.
Next, the coded data of the B-pictures are received in order of frame F2 and frame F3. Since the B-pictures have been coded by the motion compensation prediction where pictures being in the past and future points of time are used as reference pictures, as shown in FIG. 2B, the B-pictures are motion-compensated and decoded by using as reference pictures the decoded pictures obtained by decoding the coded data of the past and future frames. In this case, the B-pictures are motion-compensated and decoded with both of the frames F1 and F4 as reference pictures. Note that these B-pictures are never used as a reference picture for motion compensation.
Subsequently, frames F5 to F16 are decoded according to the type of the picture in the aforementioned same way.
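The order in which the coded pictures are received (F1, F4, F2, F3, and so on) differs from the display order because each B-picture needs its future reference picture first. A minimal sketch of that reordering follows; the function name and the string encoding of picture types are assumptions.

```python
def decoding_order(types_in_display_order):
    """Frame numbers (1-based) in the order in which they are received and decoded.

    Each I- or P-picture is placed before the B-pictures that precede it
    in display order, since those B-pictures use it as a future reference.
    """
    order = []
    waiting_b_pictures = []
    for frame_number, picture_type in enumerate(types_in_display_order, start=1):
        if picture_type == "B":
            waiting_b_pictures.append(frame_number)
        else:
            order.append(frame_number)
            order.extend(waiting_b_pictures)
            waiting_b_pictures = []
    # B-pictures with no later reference in this list are left at the end.
    order.extend(waiting_b_pictures)
    return order


# Prediction structure with two B-pictures between reference pictures.
received = decoding_order("IBBPBBP")
```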
Incidentally, if some of the motion compensation coded data is dropped and, for example, it becomes impossible to decode the motion compensation information and the prediction error of a macroblock, then the picture of that portion will be lost and picture quality will be considerably deteriorated visually. To make such deterioration in picture quality inconspicuous, normally an error correction is made in decoding.
As a conventional method for this error correction, there is a method where, for example, when a macroblock of an arbitrary picture is dropped, the dropped macroblock is replaced with a macroblock being in the same position in a past reference picture of motion compensation, as shown in FIGS. 6A to 6C. That is, in FIGS. 6A to 6C, B-pictures B2 and B3 are between P-pictures P1 and P4, and it is assumed that, when pictures are normally decoded, pictures such as those shown in FIG. 6A are obtained. In this case, when an error occurs in the hatched portion of the B-picture B2, as shown, for example, in FIG. 6B, the macroblocks of the hatched portion of the B-picture B2 are replaced with the macroblocks of the hatched portion of the P-picture P1 corresponding to the hatched portion of the B-picture B2. More specifically, for macroblocks where an error is corrected (i.e., hatched portion of the B-picture B2), the motion vector is reset to “0”, the prediction error is set to “0”, and a motion compensation is made with the past reference picture (i.e., hatched portion of the P-picture P1). In this way, deterioration in picture quality was made inconspicuous.
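The conventional correction amounts to copying the co-located macroblock from the past reference picture, which is what a zero motion vector and a zero prediction error produce. A minimal sketch follows; the function name, the (block_row, block_col) addressing of lost macroblocks, and the small block size used in the example are assumptions.

```python
def conceal_with_past_reference(decoded, past_reference, lost_blocks, block_size=16):
    """Replace each lost macroblock with the same-position block of the past reference.

    lost_blocks holds (block_row, block_col) indices of undecodable macroblocks.
    Returns a new picture; the inputs are not modified.
    """
    repaired = [row[:] for row in decoded]
    for block_row, block_col in lost_blocks:
        for r in range(block_row * block_size, (block_row + 1) * block_size):
            for c in range(block_col * block_size, (block_col + 1) * block_size):
                # Motion vector 0 and prediction error 0: a plain copy.
                repaired[r][c] = past_reference[r][c]
    return repaired


# Hypothetical 4 x 4 pictures with 2 x 2 "macroblocks" to keep the example small.
decoded = [[0] * 4 for _ in range(4)]
past_reference = [[r * 4 + c for c in range(4)] for r in range(4)]
repaired = conceal_with_past_reference(decoded, past_reference, [(0, 1)], block_size=2)
```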
However, in a case where the error correction of the B-picture was performed by the conventional error correction technique, there were some cases where the motion of the moving image became unnatural. For example, when the pictures P1 and P4 are correctly decoded and then the decoding of the macroblocks of the hatched portion of the B-picture B3 becomes impossible, as shown in FIG. 6C, the error correction of the lost macroblocks of the hatched portion of the aforementioned B-picture B3 is to be performed with the macroblocks of the hatched portion of the P-picture P1 which is the past reference picture, as described above. However, in this case, although the moving images are moving in order of P-picture P1, B-picture B2, B-picture B3, and P-picture P4, the hatched portion of the B-picture B3 is to be replaced with the picture of the hatched portion of the P-picture P1. That is, in the case of FIG. 6C, if the aforementioned replacement of the pictures is performed, then the picture will go backward. Particularly, when the motion of the moving image is a horizontal motion such as a pan operation in the operation of a camera, the aforementioned backward motion becomes very conspicuous.
In view of the foregoing, an object of this invention is to provide a moving image encoding method and apparatus which is capable of increasing coding efficiency even in a case of a prediction structure where two or more B-pictures exist between I- and P-pictures or between P-pictures, and also a moving image decoding method and apparatus which is capable of rendering a motion of an error corrected moving image better when a moving image signal is reproduced from data obtained by coding the moving image signal by a motion compensation predictive coding method.
The foregoing object and other objects of the invention have been achieved by the provision of a moving image encoding method and apparatus and a moving image decoding method and apparatus. The moving image encoding method and apparatus for encoding video signal having a predetermined picture unit of moving image signal by using a predetermined predictive video signal, comprises the steps of: calculating a distance for a time preceding past reference picture, and also a distance for a time following future reference picture by the video signal of a predetermined picture unit; and selecting the motion compensation predictive coding which is applied to the video signal of a predetermined picture unit in accordance with the calculated distances.
Further, the moving image decoding method and apparatus of this invention, comprises the steps of: decoding the video signal of a predetermined picture unit by motion compensation from a coded signal obtained by coding video signal of a predetermined picture unit of a moving image signal with the aid of a predetermined predictive video signal; calculating a distance for a time preceding past reference picture, and also a distance for a time following future reference picture by the video signal of a predetermined picture unit, in reproducing the moving image signal; and selecting a motion compensation mode with respect to the error detected video signal of predetermined picture unit in accordance with the calculated distances.
That is, according to this invention, the distance for a time preceding past reference picture and the distance for a time following future reference picture have been calculated by the video signal of a predetermined picture unit. These distances correspond to the degree of correlation of the past reference picture to the video signal of a predetermined picture unit and the degree of correlation of the future reference picture to the video signal of a predetermined picture unit. Therefore, if, for example, a higher degree of correlation (shorter distance) is selected, an error of selection can be reduced. If the error of selection is reduced, the coding efficiency of a moving image can be further enhanced and also the motion of an error corrected moving image can be made better.
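The idea of preferring the reference picture at the shorter distance can be illustrated with a small sketch. This is only one possible reading of the example given above: the function name, the use of display-order frame numbers as the distance measure, and the choice of the past reference on a tie are all assumptions, not details stated in the text.

```python
def nearer_reference(current, past_reference, future_reference):
    """Choose the reference picture closer in time to the current picture.

    All arguments are frame numbers in display order.
    """
    past_distance = current - past_reference
    future_distance = future_reference - current
    # Tie-breaking toward the past reference is an assumption.
    return "past" if past_distance <= future_distance else "future"


# B-pictures B2 and B3 between P-pictures P1 and P4, as in FIGS. 6A to 6C.
choice_for_b2 = nearer_reference(2, 1, 4)
choice_for_b3 = nearer_reference(3, 1, 4)
```

Under this reading, a lost portion of B3 would be taken from P4 rather than P1, which avoids the backward motion described for FIG. 6C.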
The nature, principle and utility of the invention will become more apparent from the following detailed description when read in conjunction with the accompanying drawings in which like parts are designated by like reference numerals or characters.