1. Field of the Invention
The present invention relates to an encoding device and a decoding device for motion vector data of a moving image.
2. Description of the Related Art
Because the amount of data of a moving image is normally large, the data is encoded with high efficiency coding when it is transferred from a transmitting device to a receiving device or when it is stored in a storage device. Here, high efficiency coding is an encoding process that converts a data sequence into a different data sequence so as to compress its data amount.
As a high efficiency coding method for moving image data, interframe predictive coding is known. This coding method takes advantage of the fact that moving image data is highly correlated in the time direction. Namely, the frame data of moving image data at a certain timing is, in many cases, very similar to the frame data at the next timing. For example, in a data transmission system using interframe predictive coding, a transmitting device generates motion vector data, which represents the motion from an image in a preceding frame to an image in a target frame, and difference data (a prediction error) between the image in the target frame and a predicted image of the target frame, the predicted image being generated from the image in the preceding frame by using the motion vector data. The transmitting device then outputs the motion vector data and the difference data to a receiving device. The receiving device reproduces the image in the target frame from the received motion vector data and difference data.
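The round trip described above can be illustrated with a minimal sketch. It assumes one-dimensional toy "frames" and a single motion vector for the whole frame; the function names `predict`, `encode`, and `decode` and the sample values are hypothetical and serve only to illustrate the principle.

```python
def predict(prev_frame, mv):
    """Predicted frame: the preceding frame shifted by mv samples.
    Samples shifted in from outside the frame repeat the edge sample."""
    n = len(prev_frame)
    return [prev_frame[min(max(i - mv, 0), n - 1)] for i in range(n)]

def encode(prev_frame, target_frame, mv):
    """Transmitting side: returns the motion vector and the difference data."""
    predicted = predict(prev_frame, mv)
    residual = [t - p for t, p in zip(target_frame, predicted)]
    return mv, residual

def decode(prev_frame, mv, residual):
    """Receiving side: rebuilds the target frame from mv and difference data."""
    predicted = predict(prev_frame, mv)
    return [p + r for p, r in zip(predicted, residual)]

# Hypothetical sample data: the content moves right by one sample,
# and the last sample changes slightly.
prev_frame = [0, 0, 5, 9, 5, 0, 0, 0]
target_frame = [0, 0, 0, 5, 9, 5, 0, 1]

mv, residual = encode(prev_frame, target_frame, mv=1)
reconstructed = decode(prev_frame, mv, residual)
print(residual)        # mostly zeros when the correlation is high
print(reconstructed == target_frame)
```

Because the two frames are highly correlated, almost every entry of the difference data is zero, which is what makes the subsequent compression effective.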
If the degree of correlation between the target and preceding frames is high in the above described encoding process, the amounts of information of the motion vector data and the difference data become small.
The above described interframe predictive coding is employed by standard methods such as ITU-T H.261, ITU-T H.263, ISO/IEC MPEG-1, and ISO/IEC MPEG-2. These standard methods also use predictive coding to encode the motion vector data itself. Hereinafter, a method for encoding motion vector data will be explained by citing ITU-T H.263 as an example.
With predictive coding, an image in each frame is partitioned into a plurality of blocks (B11, B12, B13, B14, . . . ), and image data is encoded for each of the blocks. That is, for each block, an image similar to that in the target block is extracted from the image in the preceding frame, and the difference between the extracted image and the image in the target block is obtained. In this way, differential image data from which redundancy has been removed is obtained. The motion vector data of the target block is also obtained at this time. The data to be transmitted is then compressed by encoding the differential image data and the motion vector data for each of the blocks.
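The per-block step described above can be sketched as follows. The text does not specify how the similar image is found; this sketch assumes an exhaustive search that minimizes the sum of absolute differences (SAD), which is one common choice and not necessarily the one used by any particular standard. The sign convention (the displacement points from the target block position to the matching position in the preceding frame), the function names, and the sample frames are likewise assumptions.

```python
def sad(prev, cur, top, left, dy, dx, size):
    """Sum of absolute differences between the target block in `cur` and
    the block displaced by (dx, dy) in `prev`."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[top + r][left + c]
                         - prev[top + dy + r][left + dx + c])
    return total

def search(prev, cur, top, left, size, radius):
    """Exhaustive search; returns (dx, dy, differential block)."""
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that would fall outside the preceding frame.
            if top + dy < 0 or top + dy + size > h:
                continue
            if left + dx < 0 or left + dx + size > w:
                continue
            cost = sad(prev, cur, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    _, dx, dy = best
    diff = [[cur[top + r][left + c] - prev[top + dy + r][left + dx + c]
             for c in range(size)] for r in range(size)]
    return dx, dy, diff

# Hypothetical 6x6 frames: a 2x2 patch moves from (row 1, col 1) to (row 2, col 3).
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
patch = [[9, 8], [7, 6]]
for r in range(2):
    for c in range(2):
        prev[1 + r][1 + c] = patch[r][c]
        cur[2 + r][3 + c] = patch[r][c]

dx, dy, diff = search(prev, cur, top=2, left=3, size=2, radius=2)
print((dx, dy), diff)
```

When a good match exists, the differential block is all zeros (or nearly so), and only the displacement and a small residual remain to be encoded.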
When the motion vector data of a certain block (a target block to be encoded) is encoded, a predicted value of the motion vector (hereinafter referred to as a predictive vector) of the target block to be encoded is first obtained based on motion vectors of blocks adjacent to the target block. Here, blocks which have already been encoded are selected as the adjacent blocks used for this prediction. Normally, the encoding process is started from the block at the upper left corner, and is performed for each block in each line as shown in FIG. 1. When a certain block is encoded in this case, the encoding process has already been performed for the blocks in the line above this block and the block at the left thereof. Accordingly, for example, when the motion vector of a block B22 is encoded, the motion vectors of blocks B11 through B21 can be used.
When a motion vector of a target block to be encoded is predicted with the ITU-T H.263, the block above the target block, the block at the upper right, and the block at the left are used. By way of example, when the motion vector of the block B22 shown in FIG. 1 is encoded, the motion vectors of the blocks B12, B13, and B21 are used.
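The selection of adjacent blocks can be expressed as a small sketch. It assumes that a block name such as B22 denotes row 2, column 2, as FIG. 1 suggests, and it deliberately ignores blocks at the frame border, for which the standard defines special handling not covered here. The function names are hypothetical.

```python
def candidate_blocks(row, col):
    """Positions (row, col) of the blocks above, at the upper right,
    and at the left of the target block."""
    return [(row - 1, col), (row - 1, col + 1), (row, col - 1)]

def block_name(pos):
    """Format a (row, col) position in the B<row><col> style of FIG. 1."""
    return "B%d%d" % pos

target = (2, 2)  # block B22
candidates = candidate_blocks(*target)
names = [block_name(p) for p in candidates]

# In line-by-line order starting from the upper left corner, a block is
# encoded earlier exactly when its (row, col) pair compares smaller.
already_encoded = all(p < target for p in candidates)

print(names, already_encoded)
```

All three candidates precede the target block in encoding order, so their motion vectors are available to both the encoder and the decoder.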
After the predictive vector of the target block to be encoded is obtained, the difference vector (or a prediction error vector) between a motion vector of the target block and its predictive vector is obtained. Then, the X and Y components of the difference vector are respectively encoded by using variable-length codes. The variable-length codes are, for example, Huffman codes.
A specific example will be given by referring to FIG. 2. This figure assumes that a motion vector of a target block to be encoded is (MVx, MVy), and respective motion vectors of adjacent blocks B1 through B3 used to obtain a predictive vector of the target block are respectively (PMV1x, PMV1y), (PMV2x, PMV2y), and (PMV3x, PMV3y). Here, the X component of the predictive vector of the target block is obtained as a median value of PMV1x, PMV2x, and PMV3x, while its Y component is obtained as a median value of PMV1y, PMV2y, and PMV3y. Then, difference vector data (the X and the Y components of the difference vector) are obtained by the following equations.
$$\text{difference vector data}(x) = MV_x - \mathrm{Median}(PMV1_x,\ PMV2_x,\ PMV3_x)$$

$$\text{difference vector data}(y) = MV_y - \mathrm{Median}(PMV1_y,\ PMV2_y,\ PMV3_y)$$
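The two equations above can be sketched in a few lines. The function names and the sample values below are hypothetical; the decoder-side function is included only to show that the same prediction lets the receiving side recover the motion vector from the difference vector data.

```python
def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]

def predictive_vector(pmv1, pmv2, pmv3):
    """Component-wise median of the three adjacent motion vectors."""
    return (median3(pmv1[0], pmv2[0], pmv3[0]),
            median3(pmv1[1], pmv2[1], pmv3[1]))

def difference_vector(mv, pmv1, pmv2, pmv3):
    """Encoder side: motion vector minus predictive vector."""
    px, py = predictive_vector(pmv1, pmv2, pmv3)
    return (mv[0] - px, mv[1] - py)

def restore_vector(diff, pmv1, pmv2, pmv3):
    """Decoder side: difference vector plus the same predictive vector."""
    px, py = predictive_vector(pmv1, pmv2, pmv3)
    return (diff[0] + px, diff[1] + py)

# Hypothetical values chosen only for illustration.
mv = (3, -1)
neighbours = [(2, 0), (5, -2), (3, 4)]

diff = difference_vector(mv, *neighbours)
print(predictive_vector(*neighbours), diff)
print(restore_vector(diff, *neighbours) == mv)
```

Note that the median is taken separately for the X and Y components, so the predictive vector need not be equal to any one of the three adjacent motion vectors.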
Each of difference vector data is encoded by using the variable-length codes shown in FIG. 3. The codes shown in FIG. 3 are the ones used by the ITU-T H.263.
For these codes, a data sequence having a short data length is assigned to difference vector data whose occurrence frequency is high, while a data sequence having a long data length is assigned to difference vector data whose occurrence frequency is low. The occurrence frequencies of difference vector data are statistically obtained in advance. Because short data sequences are therefore transmitted more often than long ones, the average amount of information of motion vector data per block decreases.
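The principle of assigning short data sequences to frequent values can be illustrated with a standard Huffman construction. This is a sketch of the general technique only: the table of FIG. 3 is fixed by the standard rather than built at run time, and the occurrence frequencies below are hypothetical.

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a prefix code {symbol: bit string} from {symbol: frequency}."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Merge the two least frequent subtrees, prefixing one bit to each.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

# Hypothetical occurrence counts of difference vector data values.
freqs = {0: 50, 1: 15, -1: 15, 2: 7, -2: 7, 3: 3, -3: 3}
code = huffman_code(freqs)

total = sum(freqs.values())
average_bits = sum(freqs[s] * len(code[s]) for s in freqs) / total

for sym in sorted(freqs, key=abs):
    print(sym, code[sym])
print("average bits per value:", average_bits)
```

With seven possible values, a fixed-length code would need 3 bits per value; the variable-length code spends fewer bits on average because the most frequent value receives the shortest data sequence.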
As described above, in a transmission system using an encoding method such as the ITU-T H.263, etc., data relating to a motion vector is compressed by using a predictive vector and the amount of information to be transmitted becomes small, which leads to an increase of a transmission efficiency.
For the codes widely used by existing predictive coding, a data sequence having a short data length is assigned to small difference vector data, as shown in FIG. 3. In a scene where there is little or no motion, or in a scene where an image changes uniformly, the prediction accuracy of the predictive vector is high and the difference vector data becomes small. Accordingly, the amount of information of encoded motion vector data becomes small in these scenes.
A specific example will be given by referring to FIGS. 4A and 4B. FIG. 4A exemplifies motion vectors in a scene where there is little or no motion. This figure assumes that a motion vector of a target block to be encoded is (1, 0), and motion vectors of blocks B1 through B3, which are adjacent to the target block, are respectively (0, 0), (0, 0), and (1, 0). In this case, the X and the Y components of the predictive vector of the target block are respectively obtained by the following equations.

$$\text{predictive vector}(x) = \mathrm{Median}(0,\ 0,\ 1) = 0$$

$$\text{predictive vector}(y) = \mathrm{Median}(0,\ 0,\ 0) = 0$$
Accordingly, “predictive vector”=(0, 0) is obtained.
Furthermore, the difference vector of the target block to be encoded is obtained by the following equation.
$$\text{difference vector} = \text{motion vector of the target block} - \text{predictive vector} = (1,\ 0) - (0,\ 0) = (1,\ 0)$$
For “difference vector data (difference vector component)=1”, “0010” is obtained as encoded motion vector data if the codes shown in FIG. 3 are used. For “difference vector data=0”, “1” is obtained as the encoded motion vector data. Accordingly, the encoded motion vector data to be transmitted for the target block is 5 bits.
As described above, in a scene where there is little or no motion, the difference vector data becomes small, so that the amount of information of encoded motion vector data to be transmitted also becomes small.
FIG. 4B exemplifies motion vectors in a scene where an image changes almost uniformly across frames. This figure assumes that a motion vector of a target block to be encoded is (10, −9), and motion vectors of blocks B1 through B3, which are adjacent to the target block, are respectively (10, −10), (9, −9), and (9, −9). In this case, the predictive vector is (Median (10, 9, 9), Median (−10, −9, −9))=(9, −9), so that “difference vector=(1, 0)” is obtained. Accordingly, even in a scene where an image changes uniformly, the difference vector data becomes small, so that the amount of information of encoded motion vector data to be transmitted also becomes small.
In a scene where an image does not change uniformly across frames, however, the prediction accuracy of the predictive vector is low and the difference vector data becomes large. Accordingly, the amount of information of encoded motion vector data to be transmitted becomes large in such a scene. Next, a specific example will be given by referring to FIG. 5.
FIG. 5 assumes that a motion vector of a target block to be encoded is (4, 2), and motion vectors of blocks B1 through B3, which are adjacent to the target block, are respectively (−10, 4), (−10, −10), and (4, −10). In this case, a predictive vector of the target block is obtained by using the motion vectors of the adjacent blocks as follows.
$$\text{predictive vector}(x) = \mathrm{Median}(-10,\ -10,\ 4) = -10$$

$$\text{predictive vector}(y) = \mathrm{Median}(4,\ -10,\ -10) = -10$$

Consequently, predictive vector=(−10, −10).
The difference vector of the target block is obtained by the following equation.
$$\text{difference vector} = \text{motion vector of the target block} - \text{predictive vector} = (4,\ 2) - (-10,\ -10) = (14,\ 12)$$
For “difference vector data=12”, “00000001000” is obtained as the motion vector data to be transmitted if the codes shown in FIG. 3 are used. Similarly, for “difference vector data=14”, “000000001000” is obtained as the motion vector data to be transmitted. Accordingly, the encoded motion vector data to be transmitted for the target block is 23 bits (12 bits for the X component and 11 bits for the Y component). As described above, in a scene where an image does not change uniformly, the difference vector data becomes large, so that the amount of information of the encoded motion vector data to be transmitted also becomes large.
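The three worked examples of FIGS. 4A, 4B, and 5 can be reproduced with the following sketch. The dictionary `VLC` contains only the four code words quoted in the text for the values 0, 1, 12, and 14; it is not the complete table of FIG. 3, and the function names are hypothetical.

```python
# Only the code words quoted in the text; NOT the full table of FIG. 3.
VLC = {
    0: "1",
    1: "0010",
    12: "00000001000",
    14: "000000001000",
}

def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]

def encode_motion_vector(mv, neighbours):
    """Returns (difference vector, encoded bit string) for one block."""
    px = median3(*(n[0] for n in neighbours))
    py = median3(*(n[1] for n in neighbours))
    dx, dy = mv[0] - px, mv[1] - py
    return (dx, dy), VLC[dx] + VLC[dy]

examples = {
    "FIG. 4A": ((1, 0), [(0, 0), (0, 0), (1, 0)]),
    "FIG. 4B": ((10, -9), [(10, -10), (9, -9), (9, -9)]),
    "FIG. 5": ((4, 2), [(-10, 4), (-10, -10), (4, -10)]),
}

results = {}
for name, (mv, neighbours) in examples.items():
    diff, bits = encode_motion_vector(mv, neighbours)
    results[name] = (diff, len(bits))
    print(name, diff, bits, len(bits), "bits")
```

The first two scenes cost 5 bits each, whereas the non-uniform scene of FIG. 5 costs 23 bits for a single block, which is the inefficiency the passage points out.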
As described above, moving image data is compressed with predictive coding in order to increase transmission efficiency. However, depending on the nature of the moving image, the compression ratio is not always sufficiently high.