The present invention relates to highly efficient coding and decoding of moving pictures, in which pictures are transformed into digital signals with a smaller code amount, and vice versa, for efficient transmission, storage and display of moving pictures. In particular, this invention relates to highly efficient inter-picture predictive coding and decoding that uses pictures for bidirectional inter-picture prediction.
In inter-picture predictive coding of moving pictures, highly efficient coding can be achieved by using, as reference pictures, pictures that have already been coded and that are placed in time before and after the picture to be coded, because a precise predictive signal can be produced. Pictures coded in this way are called bidirectionally-predictive coded (B) frames (pictures).
This coding method is described in Japanese Patent Laid-Open No. 2(1990)-192378, "Interframe predictive coding system", filed by the applicant of this invention. B frames are also used in the moving picture coding system called MPEG, which has been standardized by ISO/IEC.
The interframe predictive coding of B frames always requires independent frames and one-directionally predictive frames, because the B frames are always coded by prediction from other frames and are never used for predicting other frames. The independent frames are called intra (I) frames and are coded without reference to other pictures. The one-directionally predictive frames are called predictive coded (P) frames and are coded from past intra or predictive coded frames.
An image (picture) signal is composed of a plurality of frames with the I and B frames or I, P and B frames as shown in FIGS. 1A and 1B, respectively.
FIG. 1A illustrates the I frames placed in time at every fourth frame (usually, every fourth to sixth frame). The B frames placed in time between two I frames are predicted from the I frames placed in time before (past) and after (future) the B frames.
FIG. 1B illustrates a structure in which many of the B frames shown in FIG. 1A are replaced with P frames. Usually, the I frames are placed in time at every tenth to thirtieth frame and the P frames at every second to fourth frame. In the case of FIG. 1B, the coding/decoding processing is complicated and the frame binding length, that is, the unit of frames that can be accessed randomly, becomes long; however, because there are many interframe predictive frames, the coding efficiency is improved.
FIG. 2 shows a block diagram of a conventional coding apparatus for processing I and B frames.
Image signals of a moving picture are sequentially supplied to an image memory 2, a switch 4 and a picture type controller 7 via an input terminal 1.
The picture type controller 7 sets the picture type of each frame at a predetermined period in synchronism with the input frames. More specifically, every Nth frame is set as an I frame and the other frames are set as B frames. The picture type controller 7 then outputs the picture type information to the switch 4, an encoder 5 and a switch 11.
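The periodic assignment performed by the picture type controller 7 can be sketched as follows. This is only an illustrative sketch: the function name `assign_picture_types` and the choice of treating the first input frame as an I frame are assumptions that do not appear in the description above.

```python
def assign_picture_types(num_frames, n):
    """Label every n-th frame 'I' and all other frames 'B'.

    Assumption for illustration: frame 0 is the first I frame.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return ["I" if index % n == 0 else "B" for index in range(num_frames)]


# With N = 4, as in FIG. 1A, three B frames lie between consecutive I frames.
print("".join(assign_picture_types(9, 4)))  # IBBBIBBBI
```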
The image memory 2 stores the image signals of B frames among the input image signals to delay them by (N-1) frames. The delayed image signal of each B frame is then supplied to a predictive subtracter 3. The delay is needed because an I frame used for prediction must be coded by the encoder 5 before the B frames that are predicted from it.
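The effect of this delay on the coding order can be illustrated with a short sketch. It models only the resulting order, not the exact frame timing of the image memory 2; the function name `coding_order` and the handling of trailing B frames are assumptions made for illustration.

```python
def coding_order(picture_types):
    """Return frame indices in coding order for a sequence of 'I'/'B' labels.

    B frames are held back until the following I frame has been coded,
    so that both reference I frames are available for their prediction.
    B frames with no following I frame are returned separately as pending.
    """
    order = []
    pending_b = []
    for index, picture_type in enumerate(picture_types):
        if picture_type == "I":
            order.append(index)      # the I frame is coded first
            order.extend(pending_b)  # then the B frames that precede it in time
            pending_b = []
        else:
            pending_b.append(index)  # delayed, like the frames in image memory 2
    return order, pending_b


order, pending = coding_order("IBBBIBBBI")
print(order)    # [0, 4, 1, 2, 3, 8, 5, 6, 7]
print(pending)  # []
```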
The predictive subtracter 3 subtracts a predictive signal from the image signal of B frame supplied from the image memory 2 to produce a predictive residual signal that is supplied to the switch 4. The predictive signal is supplied from an adder 10 as described later.
Under the control of the picture type information, the switch 4 selects the input image signal of an I frame or the predictive residual signal and supplies it to the encoder 5.
The encoder 5 transforms the selected signal by discrete cosine transform (DCT) and quantizes it to produce a variable-length coded signal. The coded signal is then output via output terminal 6.
In the case of an I frame, the coded signal is supplied to a local decoder 14 via the switch 11 under the control of the picture type information.
The local decoder 14 performs dequantization and inverse DCT on the coded signal of the I frame to reproduce the image signal of the I frame, which is supplied to an image memory 13.
The image memory 13 stores and holds the reproduced image signal of one I frame, and outputs that image signal to an image memory 12 when the next reproduced image signal of an I frame is supplied thereto. The image memory 12 likewise stores and holds the reproduced image signal of one I frame. The image signals of the I frames stored in the image memories 12 and 13 are supplied to the adder 10 for predictive coding of B frames.
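The hand-over between the two image memories can be pictured as a two-stage shift register. The following is a minimal sketch under that reading; the class name `ReferenceMemories` and its method names are hypothetical and serve only as an illustration.

```python
class ReferenceMemories:
    """Two-stage store modelled on image memories 13 (newer) and 12 (older)."""

    def __init__(self):
        self.memory_12 = None  # older reproduced I frame
        self.memory_13 = None  # most recently reproduced I frame

    def store(self, i_frame):
        # When a new I frame arrives, the previous one moves from 13 to 12.
        self.memory_12 = self.memory_13
        self.memory_13 = i_frame

    def references(self):
        # Both frames are supplied to the adder for B-frame prediction.
        return self.memory_12, self.memory_13


memories = ReferenceMemories()
memories.store("I0")
memories.store("I4")
print(memories.references())  # ('I0', 'I4')
```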
The adder 10 adds pixel values of the image signals and divides the added result by two to produce the predictive signal that is supplied to the predictive subtracter 3.
The addition may be performed after the image signals have been weighted in accordance with the distance relationship between the I frames. Further, the adder 10 need not always perform the addition: one-directional prediction or no prediction may be selected per block, in accordance with the similarity between the frame to be coded and the frames that are placed in time before and after it and used for prediction. In this case, the encoder 5 decides the type of prediction in accordance with the similarity and outputs the prediction mode information per block.
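The operation of the adder 10 can be sketched as follows, with a frame represented as a flat list of integer pixel values. The rounding behaviour and the particular weighting rule (linear interpolation, with the nearer reference frame weighted more heavily) are assumptions for illustration, as are the function names; the description above does not specify them.

```python
def average_prediction(past, future):
    """Add co-located pixels of two reference frames and divide by two.

    Assumption: the division truncates toward the lower integer.
    """
    return [(a + b) // 2 for a, b in zip(past, future)]


def weighted_prediction(past, future, dist_past, dist_future):
    """Distance-weighted variant (assumed rule: linear interpolation).

    dist_past and dist_future are the temporal distances, in frames,
    from the frame to be coded to the past and future reference frames.
    """
    total = dist_past + dist_future
    return [
        (dist_future * a + dist_past * b + total // 2) // total
        for a, b in zip(past, future)
    ]


past_frame = [100, 50]
future_frame = [60, 90]
print(average_prediction(past_frame, future_frame))         # [80, 70]
print(weighted_prediction(past_frame, future_frame, 1, 3))  # [90, 60]
```

In the weighted example the frame to be coded lies one frame after the past reference and three frames before the future reference, so the past reference dominates the result.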
FIG. 3 shows a block diagram of a conventional decoding apparatus corresponding to the coding apparatus of FIG. 2.
A coded signal and prediction mode information are supplied to a decoder 22 via an input terminal 21. The decoder 22 performs dequantization and inverse DCT on the coded signal to reproduce the image signal of an I frame or the predictive residual signal of a B frame. The decoded signal is supplied to a switch 23. The prediction mode information is also decoded and supplied to the switch 23 and further to a switch 25.
Under the control of the prediction mode information, the switch 23 supplies the decoded image signal of an I frame to an image memory 13a.
The image memory 13a stores and holds the decoded image signal of one I frame, and outputs that image signal to an image memory 12a and the switch 25 when the next decoded image signal of an I frame is supplied thereto. The image memory 12a likewise stores and holds the image signal of one I frame.
The image signals of the I frames stored in the image memories 12a and 13a are supplied to an adder 10a for predictive decoding of B frames. The adder 10a adds pixel values of the image signals and divides the added result by two to produce the predictive signal that is supplied to a predictive adder 24.
Under the control of the prediction mode information, the decoded predictive residual signal is also supplied to the predictive adder 24 via the switch 23. The predictive adder 24 adds the predictive residual signal and the predictive signal to each other to reproduce the image signal, which is supplied to the switch 25.
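The relationship between the predictive subtracter 3 of FIG. 2 and the predictive adder 24 of FIG. 3 can be illustrated by a small round trip. This sketch assumes frames held as flat lists of integer pixels and deliberately omits the DCT and quantization stages, so the reconstruction here is exact, which it would not be in an actual codec; all function names are hypothetical.

```python
def average_prediction(past, future):
    # Predictive signal: co-located pixels added and divided by two.
    return [(a + b) // 2 for a, b in zip(past, future)]


def predictive_residual(frame, prediction):
    # Coder side: subtract the predictive signal from the B frame.
    return [x - p for x, p in zip(frame, prediction)]


def reconstruct(residual, prediction):
    # Decoder side: add the predictive signal back to the residual.
    return [r + p for r, p in zip(residual, prediction)]


past_i = [100, 50]
future_i = [60, 90]
b_frame = [82, 69]

prediction = average_prediction(past_i, future_i)    # [80, 70]
residual = predictive_residual(b_frame, prediction)  # [2, -1]
print(residual)
print(reconstruct(residual, prediction))             # [82, 69]
```

Because the decoder forms the same predictive signal from the same two I frames, only the small residual values need to be coded for the B frame.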
Under the control of the prediction mode information, the switch 25 selects the image signal of the I frame stored in the image memory 13a or the reproduced image signal from the predictive adder 24. The selected image signal is output via an output terminal 26. Here, the output signal from the output terminal 26 is composed of frames whose order has been restored to that of the image signal input to the coding apparatus of FIG. 2.
Besides the bidirectional prediction described above, there are other prediction methods that use a plurality of pictures.
One is disclosed in Japanese Laid-Open Patent No. 4(1992)-105487, "Interframe predictive coding system", filed by the applicant of this invention. This coding system adds the two preceding fields or frames to each other for prediction. No delay processing is required in this system, and hence there is no delay due to the change in coding order that occurs when B frames are predicted. Another is the system called Dual' in MPEG-2, which adds past even and odd fields or frames to each other for prediction.
As described above, the inter-picture prediction performs addition of a plurality of pictures, because bidirectional prediction can then follow changes in the picture over time by interpolative prediction. Further, not only bidirectional prediction but also other inter-picture predictions perform the addition of a plurality of pictures in order to suppress noise components contained in an input picture and other noise components produced through quantization. The addition of a plurality of pictures also has the effect of spatial filtering, which improves prediction efficiency.
However, the inter-picture prediction using a plurality of reference pictures for prediction has the following drawbacks.
The inter-picture prediction adds two pictures to produce a predictive signal for prediction of one picture. It therefore requires two pixels, one from each of the two reference pictures, for every pixel of the picture to be coded.
This means that the amount of signal read from the image memories that store the reference pictures is double that of prediction that does not use a plurality of reference pictures. The pixel transfer rate between the image memories and the prediction circuitry is therefore also doubled. As a result, the inter-picture prediction using a plurality of reference pictures requires, as hardware, data buses that carry many bits of data.
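The doubling can be made concrete with a simple calculation. The frame size and frame rate below are hypothetical values chosen only for illustration, and the sketch counts one reference pixel per coded pixel per reference picture, ignoring chrominance and any additional reads needed in practice.

```python
def reference_read_rate(width, height, frames_per_second, num_reference_pictures):
    """Reference pixels read from the image memories per second."""
    return width * height * frames_per_second * num_reference_pictures


# Hypothetical 720 x 480 picture at 30 frames per second.
one_reference = reference_read_rate(720, 480, 30, 1)
two_references = reference_read_rate(720, 480, 30, 2)
print(one_reference)                   # 10368000
print(two_references)                  # 20736000
print(two_references / one_reference)  # 2.0
```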