1. Field of the Invention
The present invention relates to a motion estimation method and an apparatus for calculating a motion vector.
2. Description of the Prior Art
In recent years, information-transmitting media such as newspapers, TV and radio have been flooded with information relating to "multimedia", to which ardent attention is paid not only by those skilled in the art but also by ordinary people throughout the world. Although variously interpreted, the term "multimedia" as used herein is considered to mean information presented as a combination of text, graphics, video, sound and the like. Since such information is generally handled by a computer, the data representative of the video and sound, as well as the text and graphics, are required to be digitized. When data representative of video sequences such as moving pictures are digitized, the amount of digitized data is extremely large in comparison with that of data representing sound, text or graphics. For this reason, the data of moving pictures to be handled by the computer are required to be compressed when the data are stored in a storage device or transmitted over a communication line.
Up until now, a wide variety of data compression processes have been proposed for compressing the data of moving pictures. Such data compression processes include a basic inter-frame predicting coding method, a motion compensation inter-frame predicting coding method, a bi-directional predicting coding method, a dual-prime predicting coding method, and other predicting coding methods. The following description will be made about the basic inter-frame predicting coding method, the motion compensation inter-frame predicting coding method, and the bi-directional predicting coding method with reference to FIGS. 97 to 103.
FIGS. 97(a) and 97(b) respectively show two types of moving pictures, different from each other in predicting coding process, each constructed of a series of pictures. Here, the term "picture" is intended to mean a frame or a field, each forming part of the moving picture. The frame comprises a first field of odd scanning lines forming part of the frame and a second field of even scanning lines forming part of the frame. The symbols "I", "P" and "B" shown in FIGS. 97(a) and 97(b) represent an "Intra-picture" (hereinlater referred to as "I-picture" for simplicity), a "Predictive-picture" (also hereinlater referred to as "P-picture" for simplicity) and a "Bidirectionally predictive-picture" (similarly, hereinlater referred to as "B-picture"), respectively. The I-picture is encoded from its original picture. The P-picture is encoded from either an I-picture or a P-picture in the same order as that of the original picture. The B-picture is encoded after the I-picture and the P-picture are processed, and is then placed between the I-picture and the P-picture. The symbol "M" represents a cycle which is updated every time an I-picture or a P-picture appears, and the symbol "Fd" represents the distance between the reference picture and the picture to be encoded.
First, the basic inter-frame predicting coding method will be described with reference to FIG. 98. This method comprises a step of calculating a difference between the pel value of each picture element (hereinlater referred to merely as "pel") of a current picture 12 and the pel value of the pel of a reference picture 11 corresponding in position thereto, the current picture 12 and the reference picture 11 each partially forming a moving picture. The method further comprises steps of comparing each difference with a predetermined threshold value, and dividing the pel values of the reference picture 11 into two data groups consisting of a significant pel value group having differences each larger than the threshold value and an insignificant pel value group having differences each equal to or less than the threshold value. The significant pel values are considered to be useful data that are not allowed to be omitted when the current picture 12 is estimated on the basis of the reference picture 11. On the contrary, the insignificant pel values are considered to be unnecessary data that are allowed to be omitted when the current picture 12 is estimated on the basis of the reference picture 11. The reference picture 11 may be either a future or a past picture with respect to the current picture 12.
If a person image 10 in the reference picture 11 has been moved right in the current picture 12 as shown in FIG. 98, there are produced two significant pel value regions, indicated by the reference numerals 13 and 14, respectively, and an insignificant pel value region, indicated by the blank area surrounding the significant pel value regions 13 and 14. The pel value of a pel of the current picture 12 within the significant pel value regions 13 and 14 can be estimated by adding, to the pel value of the corresponding pel of the reference picture 11, the difference between the pel value of the pel of the current picture 12 and the pel value of the pel of the reference picture 11 corresponding in position to each other. The pel value of each pel of the current picture 12 within the insignificant pel value region is represented by the pel value of the pel of the reference picture 11 corresponding in position thereto.
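The thresholding and reconstruction described above can be sketched as follows, assuming as a simple illustration that the pictures are given as two-dimensional arrays of pel values; the function names, the use of NumPy, and the choice of threshold are hypothetical and not part of the method itself:

```python
import numpy as np

def basic_interframe_predict(reference, current, threshold):
    """Classify pels as significant/insignificant by frame difference.

    reference, current: 2-D arrays of pel values of the same shape.
    threshold: hypothetical scalar; pels whose absolute difference
    exceeds it are treated as significant.
    Returns the difference data to encode and the significance mask.
    """
    diff = current.astype(int) - reference.astype(int)
    significant = np.abs(diff) > threshold      # significant pel value regions
    # Only differences at significant pels are transmitted; elsewhere the
    # current picture is represented by the co-located reference pel.
    encoded_diff = np.where(significant, diff, 0)
    return encoded_diff, significant

def reconstruct(reference, encoded_diff):
    # Receiver side: the reference pel plus the transmitted difference.
    return reference + encoded_diff
```

A larger threshold shrinks the significant region and hence the difference data, at the cost of the image-quality degradation noted below.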
In the case that the basic inter-frame predicting coding method is utilized, the amount of difference data rapidly decreases as the number of significant pels is decreased. This means that the compression efficiency can be enhanced. The number of significant pels is decreased by setting the threshold value large, and as a consequence the compression efficiency can be further enhanced. If, however, the threshold value becomes extremely large, the motion of the image looks jerky, or a moving portion of the image appears to be partly at a standstill, thereby resulting in the drawback that the image quality becomes poor.
In view of this property of the basic inter-frame predicting coding method, the compression efficiency is enhanced under the condition that the variation between the current picture and the reference picture is small, because the difference data are decreased in proportion to the size of the standstill image region of the current picture with respect to the reference picture. The following motion compensation inter-frame predicting coding method, however, realizes a higher compression efficiency in comparison with the basic inter-frame predicting coding method.
Likewise, on the assumption that the person image 10 in the reference picture 11 is moved right in the current picture 12, the motion compensation inter-frame predicting coding method is explained hereinafter with reference to FIG. 99. The motion compensation inter-frame predicting coding method comprises a step of calculating a motion vector "MV" indicating the movement distance and movement direction of the person image 10 between the reference picture 11 and the current picture 12. The motion compensation inter-frame predicting coding method further comprises a step of estimating the person image 10 in the current picture 12 with the aid of the motion vector MV and the pel values defining the person image 10 in the reference picture 11. In this case, there is produced only one significant pel value region 13 as shown in FIG. 99. Accordingly, the motion compensation inter-frame predicting coding method is superior to the basic inter-frame predicting coding method in that the number of significant pels can be sharply decreased and accordingly the compression efficiency can be greatly enhanced.
The motion compensation inter-frame predicting coding method will be described hereinafter in detail with reference to FIGS. 100 to 102. According to ITU-T (International Telecommunication Union--Telecommunication Standardization Sector) Recommendation H.261, the motion compensation inter-frame predicting coding method comprises steps of dividing a current picture 20 shown in FIG. 100 into a plurality of blocks including a block (hereinlater referred to as a "current block") 21, specifying a search window 31 including blocks (referred to hereinlater as "candidate blocks") in a reference picture 30, and calculating distortion values each indicative of a difference between the current block 21 and one of the candidate blocks. The distortion value is calculated by converting into positive numbers the local distortion values, each indicative of a difference between the pel value of a pel of the current block 21 and the pel value of the pel of the candidate block corresponding in position thereto, and summing up the converted local distortion values.
The motion compensation inter-frame predicting coding method further comprises steps of specifying a candidate block 32 which provides a minimum distortion value, i.e., the smallest of the distortion values calculated in the above-mentioned manner, and calculating a motion vector representative of the distance between, and the direction defined by, the current block 21 and the candidate block 32. The motion vector MV thus calculated and the distortion value between the candidate block 32 included in the reference picture 30 and the current block 21 are encoded by an encoder (not shown).
FIGS. 101(a) and 101(b) represent the relations among the current block 21, the search window 31 and the candidate blocks 32. If the current block 21 and the search window 31 contain N columns of M pels and H columns of L pels as shown in FIGS. 101(b) and 101(a), respectively, the search window 31 includes (H-N+1)×(L-M+1) candidate blocks 32, each similar to the current block 21. In the case that the pel value of the pel at the top left-hand corner of the current block 21 in FIG. 101(b) is indicated by a(0,0), the pel values of the candidate blocks 32 corresponding in position to the pel value a(0,0) of the current block 21 are included in the area defined by oblique lines in FIG. 101(a).
FIGS. 102(a) and 102(b) represent the positional relation between the pel value of each pel of the current block 21 and the pel value of the pel of the candidate block 32 corresponding in position thereto. The pel value b(l+m,h+n) in FIG. 102(a) indicates the pel value of the candidate block corresponding in position to the pel value a(m,n) of the current block 21 shown in FIG. 102(b). The pel value b(l,h) in the search window 31 shown in FIG. 102(a) represents the pel arranged at the upper left-hand corner of the candidate block 32 and accordingly corresponds in position to the pel value a(0,0) of the current block 21.
Under the state that the current block 21, the search window 31 and the candidate block 32 are as shown in FIGS. 101(a), 101(b), 102(a) and 102(b), a distortion value D(l,h) between the current block 21 and the candidate block 32 is given as follows:

D(l,h) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} ||d(m,n)||   (1)

Note that "|| ||" is indicative of a norm, and "d(m,n)" represents a local distortion value indicative of the difference between the pel values of two pels corresponding in position to each other. The norm arithmetic is absolute-value arithmetic, square arithmetic or the like. The local distortion value is defined by the following equation:

d(m,n) = b(l+m,h+n) - a(m,n)   (2)
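As a sketch, equations (1) and (2) with the absolute-value norm may be computed for a single candidate block at position (l, h) as follows; the function name and the representation of the pictures as NumPy arrays are illustrative assumptions:

```python
import numpy as np

def distortion(window, block, l, h):
    """Distortion value D(l, h) per equations (1) and (2): the sum of the
    norms of the local distortions d(m, n) = b(l+m, h+n) - a(m, n).

    window: search window, L pels per column, H columns (shape (L, H)).
    block:  current block, M pels per column, N columns (shape (M, N)).
    The absolute value is used here as the norm; the square would do as well.
    """
    M, N = block.shape
    candidate = window[l:l + M, h:h + N]     # candidate block with its
                                             # upper-left pel at b(l, h)
    return np.abs(candidate.astype(int) - block.astype(int)).sum()
```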
The above-mentioned process of comparing a block of the current picture with each of the blocks of the reference picture in the motion compensation inter-frame predicting coding method is called a block matching method. The process is particularly called a full search block matching method if the current block is compared with all the candidate blocks included in the search window.
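The full search block matching method can be sketched as follows: the distortion of equations (1) and (2) is evaluated for all (H-N+1)×(L-M+1) candidate positions, and the position of the minimum-distortion candidate yields the motion vector. The function name, the absolute-value norm, and the array representation are illustrative assumptions:

```python
import numpy as np

def full_search(window, block):
    """Compare the current block with every candidate block in the search
    window; return the offset (l, h) of the minimum-distortion candidate,
    from which the motion vector is derived, together with that minimum
    distortion value."""
    L, H = window.shape                      # L pels per column, H columns
    M, N = block.shape                       # M pels per column, N columns
    best, best_D = None, None
    for l in range(L - M + 1):               # (L-M+1) vertical positions
        for h in range(H - N + 1):           # (H-N+1) horizontal positions
            cand = window[l:l + M, h:h + N].astype(int)
            D = np.abs(cand - block).sum()   # equations (1) and (2)
            if best_D is None or D < best_D:
                best_D, best = D, (l, h)
    return best, best_D
```

Exhaustively visiting every candidate guarantees the global minimum within the window, which is why the full search is taken as the quality reference for faster, approximate search strategies.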
Such a full search block matching method is known from Japanese Patent Laid-open Publication No. 2-213291. In this method, the search window itself is moved upward, downward and leftward with respect to the current block to scan the whole of the pel values in the search window, thereby saving the calculation time required for calculating the local distortion values, i.e., the differences between the pel values of two pels corresponding in position to each other in the search window. During a cycle 1 shown in FIG. 103(a), after each of the processor elements receives a pel value in the search window, |b(l,h)-a(0,0)| is calculated (where l=0, 1, 2, and h=0, 1, 2) at each of the processor elements. During the next cycle 2, the whole of the pel values in the search window are moved upward to calculate |b(l+1,h)-a(1,0)| as shown in FIG. 103(b). During the next cycle 3 shown in FIG. 103(c), the whole of the pel values in the search window are moved leftward to calculate |b(l+1,h+1)-a(1,1)| at each of the processor elements. During the next cycle 4, the whole of the pel values in the search window are moved downward to calculate |b(l,h+1)-a(0,1)| as shown in FIG. 103(d).
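A software simulation of this window-shifting scan might look as follows. The scan order (upward along even columns, downward along odd columns, moving leftward between columns) is an assumption modeled on the four cycles of FIG. 103, and the array of processor elements, one per candidate position, is simulated by nested loops:

```python
import numpy as np

def snake_scan_order(M, N):
    """Visit the current-block pels (m, n) column by column, reversing the
    vertical direction on alternate columns, so that between consecutive
    cycles the search window is shifted by one pel only (up, left or down),
    as in FIG. 103."""
    order = []
    for n in range(N):
        rows = range(M) if n % 2 == 0 else range(M - 1, -1, -1)
        for m in rows:
            order.append((m, n))
    return order

def simulate(window, block):
    """Accumulate D(l, h) for every processor element (l, h); in each cycle
    every element computes one local distortion |b(l+m, h+n) - a(m, n)|."""
    L, H = window.shape
    M, N = block.shape
    D = np.zeros((L - M + 1, H - N + 1), dtype=int)
    for m, n in snake_scan_order(M, N):
        for l in range(L - M + 1):
            for h in range(H - N + 1):
                D[l, h] += abs(int(window[l + m, h + n]) - int(block[m, n]))
    return D
```

For a 2×2 current block this order visits a(0,0), a(1,0), a(1,1), a(0,1), matching cycles 1 to 4 described above, and the accumulated values equal the full search distortions.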
The block matching method described above is, however, required to be carried out through two different cycles, consisting of a first cycle for transmitting the whole of the pel values in the search window upward and a second cycle for transmitting them downward. The method thus requires upward and downward buses, each of which needs to be connected to each of the processor elements, thereby making it complex and difficult to design the various circuits for calculating the distortion values. Moreover, the size of the search window cannot be changed when the moving distance in a moving picture is extremely large, or when the time-lag between the play-back of the reference picture and that of the current picture whose motion vector is calculated is excessively long, because the size of the search window is determined by the number of processor elements.