In the age of multimedia which integrally handles audio, video and other information, existing information media, i.e., newspapers, magazines, televisions, radios, telephones and other means through which information is conveyed to people, have recently come to be included in the scope of multimedia. Generally, multimedia refers to something that is represented by associating not only characters, but also graphics, voices, and especially pictures and the like together, but in order to include the aforementioned existing information media in the scope of multimedia, it appears as a prerequisite to represent such information in digital form.
However, when the amount of information contained in each of the aforementioned information media is expressed as an amount of digital information, a character requires 1 to 2 bytes, whereas voice requires 64 Kbits or more per second (telephone quality) and moving pictures require 100 Mbits or more per second (current television reception quality). It is not realistic for the aforementioned information media to handle such an enormous amount of information directly in digital form. For example, although video phones are already in actual use via the Integrated Services Digital Network (ISDN), which offers a transmission speed of 64 Kbps to 1.5 Mbps, it is not practical to transmit video shot by television cameras directly through ISDN.
Against this backdrop, information compression techniques have become required, and moving picture compression techniques compliant with H.261 and H.263 standards internationally standardized by ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) are employed for video phones, for example (See, for example, Information technology—Coding of audio-visual objects—Part 2: video (ISO/IEC 14496-2), pp. 146-148, 1999. 12. 1). Moreover, according to information compression techniques compliant with the MPEG-1 standard, it is possible to store picture information in an ordinary music CD (compact disc) together with sound information.
Here, MPEG (Moving Picture Experts Group) is an international standard on compression of moving picture signals, and MPEG-1 is a standard for compressing television signal information to approximately one hundredth so that moving picture signals can be transmitted at a rate of 1.5 Mbps. Since the transmission speed within the scope of the MPEG-1 standard is limited primarily to about 1.5 Mbps, MPEG-2, which was standardized with a view to satisfying requirements for further improved picture quality, allows data transmission of moving picture signals at a rate of 2 to 15 Mbps. Furthermore, MPEG-4, which achieves a higher compression ratio than MPEG-1 and MPEG-2, allows coding, decoding and operation on an object basis, and realizes new functions required for the multimedia age, has been standardized by the working group (ISO/IEC JTC1/SC29/WG11) that was engaged in the standardization of MPEG-1 and MPEG-2. MPEG-4 was initially aimed at standardization of a coding method for low bit rates, but it has since been extended to a more versatile coding method for moving pictures that also covers interlaced images and higher bit rates.
In the above-mentioned moving picture coding, the amount of information is compressed by exploiting redundancies in the spatial and temporal directions. Here, inter picture prediction coding is used as a method of exploiting the temporal redundancies. In inter picture prediction coding, a picture is coded using a temporally forward or backward picture as a reference picture. The motion (a motion vector) of the current picture to be coded relative to the reference picture is estimated, and the difference between the picture obtained by motion compensation and the current picture is calculated. Then, the spatial redundancies are eliminated from this difference, so as to compress the information amount of the moving picture.
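The estimation and difference steps described above can be sketched as follows. This is only an illustration under stated assumptions: the exhaustive search, the sum-of-absolute-differences cost, and the names `sad`, `estimate_motion` and `residual` are hypothetical choices, since the text does not specify how the motion is estimated. Pictures are modeled as lists of rows of integer sample values.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def estimate_motion(cur, ref, bx, by, n, search):
    """Exhaustive search (an assumed strategy) for the displacement
    within +/- `search` samples that minimizes the SAD cost."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height and
                      0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue  # candidate block would fall outside the reference
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1]


def residual(cur, ref, bx, by, mv, n):
    """Difference between the current block and its motion-compensated
    prediction; this is what would go on to spatial coding."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]


# Reference picture, and a current picture whose content moved right by 1.
ref = [[x * x + 10 * y for x in range(6)] for y in range(6)]
cur = [[ref[y][max(x - 1, 0)] for x in range(6)] for y in range(6)]

mv = estimate_motion(cur, ref, bx=2, by=2, n=2, search=1)
diff = residual(cur, ref, 2, 2, mv, 2)
```

In this toy case the best match lies one sample to the left in the reference picture, so the difference block is all zeros and very little information remains to be coded.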
In a moving picture coding method in compliance with MPEG-1, MPEG-2, MPEG-4, H.263, H.26L or the like, a picture which is not inter picture prediction coded, namely, which is intra picture coded, is called an I-picture. Here, a picture means a single coding unit including both a frame and a field. Also, a picture which is inter picture prediction coded with reference to one picture is called a P-picture, and a picture which is inter picture prediction coded with reference to two previously processed pictures is called a B-picture.
FIG. 1 is a diagram showing a predictive relation between pictures in the above-mentioned moving picture coding method.
In FIG. 1, a vertical line indicates one picture, with a picture type (I, P or B) indicated at the lower right thereof. Also, FIG. 1 indicates that a picture pointed to by an arrowhead is inter picture prediction coded using the picture located at the other end of the arrow as a reference picture. For example, a B-picture which is the second from the left is coded using the first I-picture and the fourth P-picture as reference pictures.
In the moving picture coding method in compliance with MPEG-4, H.26L or the like, a coding mode called direct mode can be selected for coding a B-picture.
An inter picture prediction coding method in direct mode will be explained with reference to FIG. 2.
FIG. 2 is an illustration for explaining the inter picture prediction coding method in direct mode.
It is now assumed that a block C in a picture B3 is coded in direct mode. In this case, a motion vector MVp of a block X in a reference picture (a picture P4 that is a backward reference picture, in this case) which has been coded immediately before the picture B3 is exploited, where the block X is co-located with the block C. The motion vector MVp is a motion vector which was used when the block X was coded, and refers to a picture P1. The block C is bi-directionally predicted from the reference pictures, namely, the picture P1 and the picture P4, using motion vectors parallel to the motion vector MVp. The motion vectors used for coding the block C are, in this case, a motion vector MVFc for the picture P1 and a motion vector MVBc for the picture P4.
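A minimal sketch of how two vectors parallel to MVp might be derived is given below. The text above states only that the vectors are parallel to MVp; scaling by temporal distance is an assumption made here for illustration, as are the function name `direct_mode_vectors`, the parameters `trd` and `trb`, and the display positions 1, 3 and 4 inferred from the picture names P1, B3 and P4.

```python
from fractions import Fraction


def direct_mode_vectors(mvp, trd, trb):
    """Derive forward and backward vectors parallel to MVp.

    mvp: (x, y) motion vector of the co-located block X, referring to P1
    trd: temporal distance between the two reference pictures (P1 to P4)
    trb: temporal distance from the forward reference to the current
         picture (P1 to B3)
    Scaling by temporal distance is an assumption, not taken from the text.
    """
    mvf = tuple(Fraction(c * trb, trd) for c in mvp)          # MVFc, toward P1
    mvb = tuple(Fraction(c * (trb - trd), trd) for c in mvp)  # MVBc, toward P4
    return mvf, mvb


# P1, B3 and P4 at assumed display positions 1, 3 and 4: trd = 3, trb = 2.
mvf, mvb = direct_mode_vectors((6, -3), trd=3, trb=2)
# mvf equals (4, -2) and mvb equals (-2, 1); both are parallel to MVp.
```

Because both vectors follow from MVp, no motion vector information needs to be coded for the block C itself under this scheme.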
In the moving picture coding method in compliance with MPEG-4, H.26L or the like, a motion vector of a current block to be coded is coded as a difference from a predictive value obtained from motion vectors of neighboring blocks. In the following description, a “predictive value” indicates a predictive value of a motion vector. Since motion vectors of neighboring blocks have similar direction and motion in many cases, the amount of code required for the motion vector can be reduced by coding the difference from the predictive value obtained from the motion vectors of the neighboring blocks.
Here, a motion vector coding method in MPEG-4 will be explained with reference to FIGS. 3A-3D.
FIGS. 3A-3D are illustrations for explaining a method for coding a motion vector MV of a current block A to be coded in MPEG-4.
In FIGS. 3A-3D, blocks indicated by a thick line are macroblocks of 16×16 pixels, and there exist 4 blocks of 8×8 pixels in each macroblock. Here, it is assumed that a motion vector is obtained at a level of a block of 8×8 pixels.
As shown in FIG. 3A, as for a current block A located at the upper left in a macroblock, a difference between a predictive value and a motion vector MV of the current block A is coded, where the predictive value is calculated from a motion vector MVb of a neighboring block B to the left of the current block A, a motion vector MVc of a neighboring block C just above the current block A and a motion vector MVd of a neighboring block D above and to the right of the current block A.
Similarly, as shown in FIG. 3B, as for a current block A located at the upper right in a macroblock, a difference between a predictive value and a motion vector MV of the current block A is coded, where the predictive value is calculated from a motion vector MVb of a neighboring block B to the left of the current block A, a motion vector MVc of a neighboring block C just above the current block A and a motion vector MVd of a neighboring block D above and to the right of the current block A.
As shown in FIG. 3C, as for a current block A located at the lower left in a macroblock, a difference between a predictive value and a motion vector MV of the current block A is coded, where the predictive value is calculated from a motion vector MVb of a neighboring block B to the left of the current block A, a motion vector MVc of a neighboring block C just above the current block A and a motion vector MVd of a neighboring block D above and to the right of the current block A.
As shown in FIG. 3D, as for a current block A located at the lower right in a macroblock, a difference between a predictive value and a motion vector MV of the current block A is coded, where the predictive value is calculated from a motion vector MVb of a neighboring block B to the left of the current block A, a motion vector MVc of a neighboring block C above and to the left of the current block A and a motion vector MVd of a neighboring block D just above the current block A. Here, the predictive value is calculated using the medians obtained from the horizontal and vertical components of these three motion vectors MVb, MVc and MVd respectively.
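The median-based predictive value and the coded difference can be illustrated with a short sketch. The function names `median3`, `predict_mv_mpeg4` and `mv_difference_mpeg4` are hypothetical, and the example vectors are made up; only the component-wise median over MVb, MVc and MVd comes from the description above.

```python
def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def predict_mv_mpeg4(mvb, mvc, mvd):
    """Predictive value: the median is taken separately over the
    horizontal and the vertical components of MVb, MVc and MVd."""
    return (median3(mvb[0], mvc[0], mvd[0]),
            median3(mvb[1], mvc[1], mvd[1]))


def mv_difference_mpeg4(mv, mvb, mvc, mvd):
    """Difference between the current block's vector MV and the
    predictive value; this difference is what gets coded."""
    px, py = predict_mv_mpeg4(mvb, mvc, mvd)
    return (mv[0] - px, mv[1] - py)


# Illustrative vectors for neighboring blocks B, C and D.
pred = predict_mv_mpeg4((4, -2), (6, 0), (5, 3))             # (5, 0)
diff = mv_difference_mpeg4((5, 1), (4, -2), (6, 0), (5, 3))  # (0, 1)
```

Note that the predictive value need not equal any one of the three neighboring vectors, because each component is selected independently.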
Next, a motion vector coding method in H.26L which has been developed for standardization will be explained with reference to FIG. 4.
FIG. 4 is an illustration for explaining a method for coding a motion vector MV of a current block A in H.26L.
A current block A is a block of 4×4 pixels, 8×8 pixels or 16×16 pixels, and a motion vector of this current block A is coded using a motion vector of a neighboring block B including a pixel b located to the left of the current block A, a motion vector of a neighboring block C including a pixel c located just above the current block A and a motion vector of a neighboring block D including a pixel d located above and to the right of the current block A. Note that the sizes of the neighboring blocks B, C and D are not limited to those as shown in FIG. 4 by dotted lines.
FIG. 5 is a flowchart showing the procedure of coding the motion vector MV of the current block A using the motion vectors of the neighboring blocks as mentioned above.
First, the neighboring block which refers to the picture that the current block A refers to is specified out of the neighboring blocks B, C and D (Step S502), and the number of specified neighboring blocks is determined (Step S504).
When the number of the neighboring blocks determined in Step S504 is 1, the motion vector of that neighboring block which refers to the same picture is considered to be a predictive value of the motion vector MV of the current block A (Step S506).
When the number of the neighboring blocks determined in Step S504 is a value other than 1, the motion vector of any neighboring block, out of the neighboring blocks B, C and D, which refers to a picture other than the picture that the current block A refers to is considered to be 0 (Step S507). Then the median of the motion vectors of the neighboring blocks B, C and D is considered to be a predictive value of the motion vector of the current block A (Step S508).
Using the predictive value derived in Step S506 or Step S508 in this manner, the difference between the predictive value and the motion vector MV of the current block A is calculated and the difference is coded (Step S510).
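The procedure of FIG. 5 can be sketched as follows. This is an illustrative reading of the steps rather than the standard's normative text: the `Neighbor` type, its `ref` field (a hypothetical index identifying the referenced picture), and the function names are assumptions, and the median is taken per component.

```python
from typing import List, NamedTuple, Tuple

Vector = Tuple[int, int]


class Neighbor(NamedTuple):
    mv: Vector  # motion vector of the neighboring block
    ref: int    # hypothetical index of the picture the block refers to


def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def predict_mv_h26l(neighbors: List[Neighbor], current_ref: int) -> Vector:
    """Predictive value for the current block A from neighbors B, C, D."""
    # Steps S502/S504: find and count neighbors referring to the same picture.
    same_ref = [n for n in neighbors if n.ref == current_ref]
    if len(same_ref) == 1:
        # Step S506: the single matching neighbor supplies the predictive value.
        return same_ref[0].mv
    # Step S507: vectors referring to another picture are considered to be 0.
    mvs = [n.mv if n.ref == current_ref else (0, 0) for n in neighbors]
    # Step S508: median of the three vectors, taken per component.
    return (median3(mvs[0][0], mvs[1][0], mvs[2][0]),
            median3(mvs[0][1], mvs[1][1], mvs[2][1]))


def mv_difference_h26l(mv: Vector, neighbors: List[Neighbor],
                       current_ref: int) -> Vector:
    """Step S510: the difference from the predictive value is what is coded."""
    px, py = predict_mv_h26l(neighbors, current_ref)
    return (mv[0] - px, mv[1] - py)


# Only block B refers to the same picture (index 0) as the current block.
one_match = [Neighbor((4, 2), 0), Neighbor((8, -2), 1), Neighbor((6, 0), 1)]
# Blocks B and C refer to the same picture; block D does not.
two_match = [Neighbor((4, 2), 0), Neighbor((8, -2), 0), Neighbor((6, 0), 1)]
```

With `one_match` the predictive value is block B's vector (4, 2); with `two_match` block D's vector is replaced by 0 and the median gives (4, 0).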
As described above, in the motion vector coding methods in compliance with MPEG-4 and H.26L, motion vectors of neighboring blocks are exploited when coding a motion vector of a current block to be coded.
However, there are cases where motion vectors of neighboring blocks are not coded, for example, where a neighboring block is intra picture coded, where it is coded in direct mode in a B-picture, or where it is coded in skip mode in a P-picture. In these cases, except when they are intra picture coded, the neighboring blocks are coded using the motion vectors of other blocks; namely, the neighboring blocks are not coded using their own motion vectors based on the result of motion estimation.
So, according to the above-mentioned traditional motion vector coding method, a motion vector of a current block is coded as follows. When there exists one neighboring block, out of the three neighboring blocks, which has no motion vector based on the above result of motion estimation and has been coded using motion vectors of other blocks, the motion vector of that neighboring block is considered to be 0. When there exist two such neighboring blocks, the motion vector of the remaining neighboring block is used as a predictive value. And when all three neighboring blocks are such blocks, the motion vector is coded considering the predictive value to be 0.
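The three cases of the traditional method can be sketched as below. Representing a neighboring block that has no motion vector of its own by `None` is an assumption made for illustration, as is the function name `predict_mv_traditional`; taking the component-wise median after the zero substitution follows the median-based prediction described earlier.

```python
def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def predict_mv_traditional(neighbor_mvs):
    """Predictive value from three neighbors; `None` marks a neighbor
    with no motion vector based on its own motion estimation."""
    available = [mv for mv in neighbor_mvs if mv is not None]
    if len(available) == 0:
        # All three neighbors lack a vector: predictive value is 0.
        return (0, 0)
    if len(available) == 1:
        # Two neighbors lack a vector: use the remaining one directly.
        return available[0]
    # At most one neighbor lacks a vector: consider it to be 0, then
    # take the median per component.
    filled = [mv if mv is not None else (0, 0) for mv in neighbor_mvs]
    return (median3(filled[0][0], filled[1][0], filled[2][0]),
            median3(filled[0][1], filled[1][1], filled[2][1]))


# Block C (say, coded in direct mode) has no vector of its own.
pred_one_missing = predict_mv_traditional([(4, 2), None, (6, 0)])   # (4, 0)
pred_two_missing = predict_mv_traditional([None, (3, 1), None])     # (3, 1)
pred_all_missing = predict_mv_traditional([None, None, None])       # (0, 0)
```

In the first case the zero vector substituted for block C takes part in the median regardless of how block C was actually motion compensated.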
However, in direct mode or skip mode, motion compensation is actually performed in the same way as when a block's own motion vector based on the estimation result is used, even though the motion vector information is not coded. Nevertheless, in the above traditional method, if a neighboring block is coded in direct mode or skip mode, the motion vector of that neighboring block is not used as a candidate for a predictive value. This causes a problem in that the predictive value of a motion vector becomes inaccurate when the motion vector is coded, which lowers coding efficiency.
The present invention is conceived to solve this problem, and the object thereof is to provide a motion vector coding method and a motion vector decoding method for obtaining a more accurate predictive value for higher coding efficiency.