(1) Field of the Invention
The present invention relates to a motion vector estimation apparatus used for inter-picture prediction coding of a moving picture made up of plural pictures, and more specifically to a motion vector estimation apparatus which estimates the position in another picture from which a block has moved and expresses that motion as a motion vector.
(2) Description of the Related Art
Generally, in picture coding, the amount of information is compressed by using redundancies in the spatial direction and the temporal direction of moving pictures. Inter-picture prediction coding is used as a method of utilizing redundancies in the temporal direction. When a picture is coded with inter-picture prediction coding, a picture that precedes or follows it in the display order is selected as a reference picture. Next, motion vectors are estimated from the reference picture, and the amount of information is compressed by removing redundancies in the spatial direction from the difference value between the motion-compensated picture and the picture targeted for coding.
In moving picture coding methods such as MPEG, a picture which is intra-picture prediction coded using only the picture to be coded, without a reference picture, is called an I picture. Here, a picture represents a unit of coding which includes both a frame and a field. A picture which is inter-picture prediction coded by referencing one picture that has already been coded is called a P picture, and a picture which is inter-picture prediction coded by referencing two pictures that have already been coded is called a B picture.
FIG. 1 is a schematic diagram which shows the prediction relationships for each picture in a motion picture coding scheme. In FIG. 1, a vertical line indicates a picture and, at the bottom right of each picture, a picture type (I, P, B) is shown. The arrows in FIG. 1 indicate that the picture at the tip of the arrow is inter-picture prediction coded using the picture at the rear of the arrow as a reference picture. For example, the second B picture from the top of the order is encoded using the I picture at the top of the order and the fourth P picture from the top of the order as reference pictures.
FIG. 2 is a diagram which shows the display sequence and the coding sequence of the pictures.
As shown in (a) of FIG. 2, the display sequence for the pictures is P picture P1, B picture B2, B picture B3, P picture P4, B picture B5, B picture B6, P picture P7, B picture B8, B picture B9 and P picture P10. On the other hand, the coding sequence for these pictures is P picture P1, P picture P4, B picture B2, B picture B3, P picture P7, B picture B5, B picture B6, P picture P10, B picture B8 and B picture B9.
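The reordering from display sequence to coding sequence can be illustrated with a minimal Python sketch. It assumes, as in FIG. 1, that each B picture references the nearest preceding and following I or P picture, so that a B picture can only be coded after the later of those two pictures. The function name `coding_order` and the string labels are hypothetical and chosen for illustration only.

```python
def coding_order(display_order):
    """Reorder pictures so that every B picture follows both pictures it references.

    Assumption: labels start with their picture type ("I", "P" or "B"), and a
    B picture references the nearest earlier and later I/P picture.
    """
    coded = []
    pending_b = []  # B pictures waiting for their later reference picture
    for pic in display_order:
        if pic.startswith("B"):
            pending_b.append(pic)
        else:
            coded.append(pic)        # code the I/P picture first
            coded.extend(pending_b)  # then the B pictures that needed it
            pending_b.clear()
    coded.extend(pending_b)  # trailing B pictures with no later I/P picture
    return coded


display = ["P1", "B2", "B3", "P4", "B5", "B6", "P7", "B8", "B9", "P10"]
print(coding_order(display))
```

Applied to the display sequence in (a) of FIG. 2, the sketch places P4 ahead of B2 and B3, P7 ahead of B5 and B6, and P10 ahead of B8 and B9.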
Note that, in comparison to conventional moving picture coding schemes such as MPEG-2, in the latest moving picture coding scheme H.264 a B picture may reference more than two pictures.
FIG. 3 is a diagram which shows the reference relationships in H.264.
As shown in FIG. 3, a B picture references, for instance, a P picture that is two pictures ahead of and a P picture that is one picture behind the B picture in the display order. Next, the motion vectors in the B picture are estimated. In this way, in H.264, the number of reference pictures which can be referenced for estimating motion vectors relative to the same B picture is larger than in MPEG-2 (see, for example, Non-Patent Document 1, ITU-T Recommendation H.264 (03/2005): “Advanced video coding for generic audiovisual services”, ITU-T, and Non-Patent Document 2, H.264 and MPEG-4 Video Compression “Video Coding for Next generation Multimedia”, WILEY).
Motion vector estimation in inter-picture prediction coding is performed on a block basis. For every block included in the picture targeted for coding, the reference picture is searched for the block whose image is nearest to the image in that block. In motion vector estimation, the search range is normally set in advance in consideration of the computation load and the accuracy of the motion vector. Because an object is expected to move farther as more time elapses, the search range for the motion vector must be expanded in proportion to the distance between the picture to be coded and the reference picture.
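The block-basis search can be sketched as an exhaustive (full) search. This is a minimal illustration rather than the method of any particular apparatus: it assumes pictures are two-dimensional lists of luminance values and uses the sum of absolute differences (SAD) as the measure of similarity, which the text does not specify. The names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n target block at (bx, by)
    in `cur` and the candidate block displaced by (dx, dy) in `ref`."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n, s):
    """Try every displacement within +/-s and return (motion_vector, cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-s, s + 1):
        for dx in range(-s, s + 1):
            # Skip candidates that would fall outside the reference picture.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The number of candidates evaluated is (2s+1)×(2s+1) per block, which is why the choice of search range dominates the computation load.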
FIG. 4 is a diagram which shows search ranges for motion vector estimation.
For example, when the distance (inter-picture distance) between the picture to be coded and the reference picture is 1 and the motion vector search range is ±S×±S, the search range becomes (d×±S)×(d×±S) once the inter-picture distance becomes d. In this way, the search range expands by d×d times. In other words, when the inter-picture distance lengthens, the search range expands in proportion to the square of the ratio of change in the distance.
As shown in FIG. 4, when the motion vectors MV1, MV2 and MV3 for the block included in the (n+3)th picture to be coded are estimated using an nth reference picture, an (n+1)th reference picture or an (n+2)th reference picture, the search range of the (n+1)th reference picture is 2×2 times the search range of the (n+2)th reference picture and the search range for the nth reference picture is 3×3 times the search range of the (n+2)th reference picture.
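The growth of the search range with inter-picture distance can be checked with a short Python sketch. The value S = 16 used in the example is an assumption for illustration only, and the function names are hypothetical.

```python
def search_area(s, d=1):
    """Area in pixels of a search range of (d x +/-S) by (d x +/-S)."""
    side = 2 * d * s  # the range spans -d*S .. +d*S in each direction
    return side * side


def expansion_factor(s, d):
    """How many times larger the range is at distance d than at distance 1."""
    return search_area(s, d) // search_area(s, 1)


# Reference pictures (n+2), (n+1) and n lie at distances 1, 2 and 3
# from the (n+3)th picture to be coded, as in FIG. 4.
for d in (1, 2, 3):
    print(d, expansion_factor(16, d))
```

The factor depends only on d, not on S, which matches the 2×2 and 3×3 ratios given for FIG. 4.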
Thus, when the inter-picture distance increases, the search range for the motion vectors expands sharply and the amount of computation becomes enormous. In order to decrease the amount of computation, motion vector estimation apparatuses have been proposed which estimate motion vectors by telescopic searches (see for example, Patent Document 1, Japanese Patent Publication No. 2830183, Patent Document 2, Japanese Patent Publication No. 3335137 and Patent Document 3, Japanese Laid-Open Patent No. H10-341440 Publication).
A telescopic search is a method for estimating motion vectors by successively searching for motion in the pictures that lie between a reference picture and a picture to be coded. In this method, even if the inter-picture distance lengthens, the search range expands in proportion to the ratio of change in the distance, not in proportion to the square of that ratio.
FIG. 5 is a diagram which shows a telescopic search.
For example, the motion vector for a block (target block) included in the (n+3)th picture to be coded is estimated using the nth reference picture. In this case, for the telescopic search, a motion vector v1 from the (n+3)th picture to the (n+2)th picture is estimated using a search range (±S×±S) centered on the same position as the target block in the (n+2)th picture. Next, a motion vector v2+v1 from the (n+3)th picture to the (n+1)th picture is estimated using a search range (±S×±S) centered on the position indicated by the motion vector v1 in the (n+1)th picture. In the same way, the motion vector v3+v2+v1 from the (n+3)th picture to the nth picture is estimated using the search range (±S×±S) centered on the position indicated by the motion vector v2+v1 in the nth picture. This motion vector is the motion vector v0 for the target block. In other words, the overall search range becomes 3×(±S×±S).
However, there is the problem that when the motion vector estimation apparatuses in Patent Documents 1 through 3 perform a telescopic search, the amount of computation is large.
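The successive searches can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation of Patent Documents 1 through 3: pictures are two-dimensional lists of luminance values, similarity is measured by the sum of absolute differences (SAD), and the names `sad`, `search_around` and `telescopic_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the target block and a candidate."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def search_around(cur, ref, bx, by, n, s, center):
    """Best displacement (measured from the target block) within +/-s of `center`."""
    height, width = len(ref), len(ref[0])
    cx, cy = center
    best, best_cost = center, None
    for dy in range(cy - s, cy + s + 1):
        for dx in range(cx - s, cx + s + 1):
            # Skip candidates that would fall outside the picture.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def telescopic_search(cur, pictures, bx, by, n, s):
    """`pictures` runs from the picture nearest the picture to be coded
    to the reference picture, e.g. [(n+2)th, (n+1)th, nth]."""
    mv = (0, 0)  # the first search is centered on the co-located block
    for pic in pictures:
        # Each search is centered on the vector found in the previous picture.
        mv = search_around(cur, pic, bx, by, n, s, mv)
    return mv
```

Each call to `search_around` uses the same ±s window, so a displacement larger than s can still be found in the final picture as long as the motion between adjacent pictures stays within ±s.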
In other words, in a telescopic search, motion searches are performed successively on the pictures between a reference picture and a picture to be coded, so that the search range in each picture is fixed regardless of the inter-picture distance. However, since the pictures in between must be referenced in addition to the reference picture, even though they are otherwise unrelated to the coding, the number of pictures referenced is proportional to the inter-picture distance. Accordingly, the overall search range increases in proportion to the inter-picture distance and, as a result, the amount of computation for estimating motion vectors increases.
Further, there is the problem that circuit size increases in a motion vector estimation apparatus that performs the telescopic search. In other words, the motion vector estimation apparatus must read a large amount of search range data out of the memory storing the reference picture in order to estimate a motion vector. As a result, the memory transfer operation clock must be accelerated and the memory bandwidth (bit width) expanded, which increases the circuit size of the motion vector estimation apparatus.
FIG. 6 is a diagram which shows changes in the search range in a reference picture. A search range TA1 in a reference picture RP1 for a target block (commonly a macroblock, for instance a block composed of 16 pixels×16 lines) is a range centered on a block TB1 which is co-located with the target block in the reference picture RP1. Here, the reference picture RP1 is composed of H horizontal pixels and V vertical lines, and the search range TA1 is composed of h horizontal pixels and v vertical lines. Accordingly, when estimating a motion vector for the target block, data for the h horizontal pixels and the v vertical lines is read out of the memory.
Next, when the target block shifts in the horizontal direction, the search range TA2 in the reference picture RP1 for the target block becomes a range centered on a block TB2 co-located with the target block in the reference picture RP1. Accordingly, when the target block shifts in the horizontal direction, the search range also shifts by the block width w. In this case, the data read out of the memory when estimating the motion vector for the target block is made up only of data from the search range TA2 that is not included in the search range TA1. In other words, the data newly read out of the memory is data made up of 16 horizontal pixels and v vertical lines.
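The difference in computation can be made concrete with a rough Python count of candidate positions. The sketch assumes an integer-pixel search in which every position within the range, including the center, is evaluated once; the function names and the values S = 16 and d = 3 are hypothetical choices for illustration.

```python
def direct_candidates(s, d):
    """Candidates when the reference picture at distance d is searched directly
    with a range of (d x +/-S) by (d x +/-S)."""
    return (2 * d * s + 1) ** 2


def telescopic_candidates(s, d):
    """Candidates when each of the d pictures is searched with a +/-S range."""
    return d * (2 * s + 1) ** 2


def pictures_read(d):
    """Pictures whose data must be fetched in a telescopic search."""
    return d


print(direct_candidates(16, 3), telescopic_candidates(16, 3), pictures_read(3))
```

Under these assumptions the telescopic count grows linearly with d rather than quadratically, but it still grows, and every intermediate picture has to be read from memory.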
In this way, when the target block shifts in a horizontal direction, only data that is not included in the previous search range is read out of the memory, not all of the data included in the new search range.
For a telescopic search, data in the search range is read out of the memory for pictures which are mutually adjacent, as above. However, in this case, the search range for every picture does not always shift by just the width of the block; the search range may shift widely. Thus, even if the data in the search range is read out of the memory, used for motion vector estimation and retained for estimating the next motion vector, most of the data in the new search range must still be read out of the memory, since the two ranges overlap very little. Accordingly, in a telescopic search, the amount of data transferred from the memory increases. As a result, power consumption rises.
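The effect of the shift on memory traffic can be estimated with a small Python sketch that counts the pixels of a new search range not already covered by the previous one. The function name is hypothetical, and the 48×48 search range in the example is an assumed size, not a value taken from the figures.

```python
def newly_read_pixels(h, v, dx, dy=0):
    """Pixels of an h x v search range that lie outside the previous search
    range after the range has shifted by dx pixels and dy lines."""
    overlap_w = max(0, h - abs(dx))  # columns shared with the previous range
    overlap_h = max(0, v - abs(dy))  # lines shared with the previous range
    return h * v - overlap_w * overlap_h


# Shift by one block width (16 pixels): only a 16-pixel-wide strip is new.
print(newly_read_pixels(48, 48, 16))
# A wide shift, as may occur in a telescopic search: almost everything is new.
print(newly_read_pixels(48, 48, 40, 40))
```

When the shift reaches or exceeds the size of the search range in either direction, the overlap is zero and the whole range must be read again.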