1. Field of the Invention
This invention can be used in low bit rate video coding for telecommunication applications. It improves the temporal frame rate of the decoder output as well as the overall picture quality.
2. Related art of the Invention
In a typical hybrid transform coding algorithm such as ITU-T Recommendation H.261 [1] or MPEG [2], motion compensation is used to reduce the temporal redundancy in the sequence. In the H.261 coding scheme, the frames are coded using only forward prediction; such frames are hereafter referred to as P-frames. In the MPEG coding scheme, some frames are coded using bi-directional prediction; these are hereafter referred to as B-frames. B-frames improve the efficiency of the coding scheme. Here, [1] is ITU-T Recommendation H.261 (formerly CCITT Recommendation H.261), "Video codec for audiovisual services at p×64 kbit/s," Geneva, 1990, and [2] is ISO/IEC 11172-2:1993, "Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 2: Video."
However, the use of B-frames introduces delay in both encoding and decoding, making it unsuitable for communication services where delay is an important parameter. FIGS. 1a and 1b illustrate the frame prediction of H.261 and MPEG as described above. A new method of coding, in which the P-frame and the B-frame are coded as a single unit, hereafter referred to as the PB-frame, was introduced. In this scheme the blocks in the PB-frame are coded and transmitted together, thus reducing the total delay. In fact, the total delay should be no greater than that of a scheme using forward prediction only at half the frame rate.
FIG. 2a shows the PB-frame prediction. A PB-frame consists of two pictures coded as one unit. The name PB comes from the picture types in MPEG, where there are P-frames and B-frames. Thus a PB-frame consists of one P-frame, which is predicted from the last decoded P-frame, and one B-frame, which is predicted from both the last decoded P-frame and the P-frame currently being decoded. This last picture is called a B-frame because parts of it may be bi-directionally predicted from the past and future P-frames.
FIG. 2b shows the forward and bi-directional prediction for a block in the B-frame, hereafter referred to as a B-block. Only the region that overlaps with the corresponding block in the current P-frame, hereafter referred to as the P-block, is bi-directionally predicted. The rest of the B-block is forward predicted from the previous frame. Thus only the previous frame is required in the frame store. The information from the P-frame is obtained from the P-block currently being decoded.
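The split between bi-directionally and forward-predicted regions of a B-block can be illustrated with a small sketch. It is not taken from the source: it assumes a square block, an integer-pel backward motion vector, and a P-block co-located with the B-block, and the function name `bidirectional_mask` is hypothetical.

```python
def bidirectional_mask(mvb, size=8):
    """Return a size x size grid of booleans for one B-block.

    True  -> the pixel's backward reference lands inside the co-located
             P-block, so it can be bi-directionally predicted.
    False -> the backward reference falls outside the P-block, so the
             pixel is forward predicted from the previous frame only.

    Assumptions (not from the source): integer-pel vector, square block,
    P-block at the same position as the B-block.
    """
    dx, dy = mvb
    return [
        [0 <= x + dx < size and 0 <= y + dy < size for x in range(size)]
        for y in range(size)
    ]


if __name__ == "__main__":
    # Print the mask for a 4x4 block and a backward vector of (2, -1).
    for row in bidirectional_mask((2, -1), size=4):
        print("".join("B" if flag else "F" for flag in row))
```

With a zero backward vector every pixel is marked bi-directional. As the vector grows, the overlapping region shrinks and more of the block falls back to forward prediction.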
In the PB-block only the motion vectors for the P-block are transmitted to the decoder. The forward and backward motion vectors for the B-block are derived from the P motion vectors. A linear motion model is used, and the temporal references of the B and P frames are used to scale the motion vector appropriately. FIG. 3a depicts the motion vector scaling, and the formulas are shown below.
MVF = (TRB × MV) / TRP   (1)
MVB = ((TRB − TRP) × MV) / TRP   (2)
where
MV is the motion vector of the P-block,
MVF and MVB are the forward and backward motion vectors for the B-block,
TRB is the increment in the temporal reference from the last P-frame to the current B-frame, and
TRP is the increment in the temporal reference from the last P-frame to the current P-frame.
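As a worked illustration of equations (1) and (2), the following minimal sketch scales a P-block motion vector into the forward and backward vectors of the B-block. The function name `scale_motion_vector` is hypothetical. Exact rational arithmetic is used because the text does not specify any rounding rule, and the check that TRB lies strictly between 0 and TRP is an assumption based on the B-frame lying between the two P-frames.

```python
from fractions import Fraction


def scale_motion_vector(mv, trb, trp):
    """Derive (MVF, MVB) for a B-block from the P-block vector MV.

    mv  : (x, y) motion vector of the P-block
    trb : temporal reference increment, last P-frame -> current B-frame
    trp : temporal reference increment, last P-frame -> current P-frame

    Results are exact fractions; a real codec would apply its own
    rounding rule, which is not specified here.
    """
    if not 0 < trb < trp:
        # Assumption: the B-frame lies strictly between the two P-frames.
        raise ValueError("expected 0 < TRB < TRP")
    mvf = tuple(Fraction(trb * c, trp) for c in mv)          # equation (1)
    mvb = tuple(Fraction((trb - trp) * c, trp) for c in mv)  # equation (2)
    return mvf, mvb


if __name__ == "__main__":
    mvf, mvb = scale_motion_vector((6, -4), trb=1, trp=2)
    print("MVF =", tuple(str(c) for c in mvf))
    print("MVB =", tuple(str(c) for c in mvb))
```

For MV = (6, −4), TRB = 1 and TRP = 2, this gives MVF = (3, −2) and MVB = (−3, 2). Subtracting equation (2) from equation (1) shows that MVF − MVB = MV, so the two derived vectors always span the P-block vector.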
The method currently used in the prior art assumes a linear motion model. However, this assumption does not hold in a typical scene, where motion is generally not linear. This is especially true when the camera shakes or when objects do not move at constant velocity.
A second problem involves the quantization and transmission of the residual of the prediction error in the B-block. Currently the coefficients from the P-block and the B-block are interleaved in some scanning order, which requires the B-block coefficients to be transmitted even when they are all zero. This is inefficient, since quite often there are no residual coefficients to transmit (all coefficients are zero).