1. Technical Field of the Invention
The present invention relates to a motion vector detection apparatus to be used in an apparatus for encoding dynamic images. In particular, the present invention relates to a motion vector detection apparatus for detecting a motion vector representing the movement direction and movement quantity of a dynamic image signal between dynamic image frames.
2. Description of the Prior Art
With the development of computers and communication networks, digital processing of video signals is spreading rapidly. For example, dynamic images are distributed to homes by communication satellite, and use of digital image techniques such as digital movies and DVDs (digital versatile discs) is spreading rapidly.
Implementation of faster communication networks for transmitting image data greatly contributes to the development of such techniques. Alongside this, the advancement of image compression techniques such as MPEG (moving picture experts group: ISO) cannot be overlooked. For example, in MPEG, a motion vector indicating the position in the next frame to which an image included in a certain frame has moved is detected in order to compress a dynamic image between frames. For an image part which does not move in successive frames, a previously detected image is used as it is. For an image part having a motion between frames, signal processing that shifts the image part by using a detected motion vector is conducted to increase the compression factor of the image. Under such circumstances, techniques for detecting motion vectors are attracting attention.
FIG. 12 shows an example of a relation between a frame and a macro block used in motion vector detection. In FIG. 12, a rectangle surrounded by an outer frame represents a frame 11 displayed on the screen at a certain time. In this example, the frame 11 is formed of 352 pixels by 240 lines. The frame 11 is divided into macro blocks 12 each formed of a rectangular region having 16 pixels by 16 pixels. In this case, the frame 11 is formed of 22×15 macro blocks 12. Each macro block 12 becomes the unit of motion vector detection.
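The division of the frame 11 into macro blocks 12 can be checked with a short sketch. The function name macro_block_grid is a hypothetical helper introduced only for illustration, and it assumes that the frame dimensions are exact multiples of the block size, as in the example of FIG. 12.

```python
# Hypothetical helper (not part of the source): count the macro blocks in a frame.
def macro_block_grid(width, height, block=16):
    """Return (columns, rows) of block x block macro blocks covering the frame."""
    if width % block or height % block:
        # Assumption: the frame divides evenly, as in the 352 x 240 example.
        raise ValueError("frame dimensions must be multiples of the block size")
    return width // block, height // block


cols, rows = macro_block_grid(352, 240)
print(cols, rows, cols * rows)  # 22 15 330
```

For the 352 pixel by 240 line frame this gives 22 columns and 15 rows, that is, 330 macro blocks per frame.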
The reason why a region of several pixels is thus defined as the macro block 12 is that a region large enough to be recognized as a pattern is needed in order to judge the motion of the image within the frame. If the size of the macro block 12 is expanded more than needed, then problems occur, for example that a region which is stationary between frames cannot be picked out efficiently. Conversely, as the size of the macro block is made smaller, the quantity of information available for each comparison decreases while the data quantity of the whole processing for detecting motion vectors increases. Under such circumstances, the macro block size is in many cases approximately equal to the above described pixel size.
FIG. 13 is a diagram showing the concept of the motion vector. It is now assumed that there are a first frame 111 and a second frame 112 which is temporally later than the first frame 111, on a time axis (t). It is assumed that a dynamic image corresponding to an arbitrary macro block 121 shaded in the first frame 111 is judged to have moved from the same position 14 in the frame to a different position 15 in the second frame 112 obtained after the elapse of a predetermined time. In this case, a motion vector 16 can be represented as a connection of the position 14 before the movement to the position 15 after the movement in the same frame 112.
FIG. 14 shows the principle of detecting such a motion vector. In the same way as the description given by referring to FIG. 13, it is now assumed that there are a first frame 111 and a second frame 112 which is temporally later than the first frame 111. The second frame 112 is a picture which is now being subjected to processing for detecting a motion vector. The first frame 111 is a picture for which processing has already been finished. As for a macro block 122 in the position 15 of the second frame 112 for which a motion vector is to be detected, retrieval is effected to find a position of the first frame 111 having a macro block which resembles the macro block 122 most closely and thereby the motion vector is detected.
At this time, a concept referred to as a retrieval range 21 is introduced in order to reduce the burden of the search processing. Only in the retrieval range 21, a macro block of a dynamic image corresponding to the macro block 122 is detected. This detection operation is conducted by scanning the image pattern in the same range as that of the macro block 122 as in TV raster scan, beginning from the left top corner and successively in the retrieval range 21. A scan position which has caused the highest coincidence becomes a start point of the motion vector. For the purpose of calculation processing for judging the start point of the motion vector, a concept called vector evaluation value is introduced in some cases.
FIG. 15 is a diagram showing the concept of the vector evaluation value so as to correspond to FIG. 14. In the retrieval range 21 shown in FIG. 14, an inspection framework 311 having the same size as that of the macro block 122 is disposed. And absolute values of differences respectively between 16 by 16 pixels included in the inspection framework 311 (see a right bottom part of FIG. 12) and 16 by 16 pixels forming the macro block 122 are derived, and they are added together. A resultant sum ΣE1 is adopted as the vector evaluation value for the inspection framework 311. The reason why absolute values are derived is that the difference of signal levels of pixels is to be found. If the inspection framework 311 completely corresponds to the macro block 122, then ideally the vector evaluation value ΣE1 becomes “0.”
If the vector evaluation value ΣE1 between the inspection framework 311 and the macro block 122 is thus derived, then an inspection framework 312 obtained by moving the inspection framework 311 in a direction of an arrow 32 shown in FIG. 15 by one pixel is disposed in the same way. Between the inspection framework 312 and the macro block 122, absolute values of differences respectively of 16 by 16 pixels are derived. The absolute values are added. A resultant vector evaluation value is ΣE2. Then an inspection framework 313 obtained by further moving the inspection framework 312 in a direction of the arrow 32 by one pixel is disposed in the same way. Between the inspection framework 313 and the macro block 122, absolute values of differences respectively of 16 by 16 pixels are derived. The absolute values are added. A resultant vector evaluation value is ΣE3. Hereafter, all regions in the retrieval range 21 as described with reference to FIG. 14 are thus scanned in the same way. Respective vector evaluation values ΣE are thus derived. It is understood that such a position of the framework 31 in the retrieval range 21 that the sum of absolute values becomes minimum is a position which has become the start point of movement of the macro block 122. By thus calculating the vector evaluation value in every position in the retrieval range 21, comparing them with each other, and finding an inspection framework 31 having a minimum value, a motion vector in the position of the macro block 122 can be derived.
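The vector evaluation value and the exhaustive scan described above can be sketched as follows. This is a minimal illustration, not the circuit of any cited publication: frames are assumed to be lists of rows of integer pixel levels, the retrieval range is assumed to cover the whole (small) reference image, and the names sad and full_search are hypothetical.

```python
def sad(ref, cur, top, left, size):
    """Vector evaluation value: sum of absolute differences between the
    macro block `cur` (size x size) and the inspection framework of `ref`
    whose top-left pixel is at (top, left)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(ref[top + y][left + x] - cur[y][x])
    return total


def full_search(ref, cur, size):
    """Raster-scan every framework position; return (value, top, left)
    of the position with the minimum vector evaluation value."""
    best = None
    for top in range(len(ref) - size + 1):
        for left in range(len(ref[0]) - size + 1):
            value = sad(ref, cur, top, left, size)
            if best is None or value < best[0]:
                best = (value, top, left)
    return best


# Toy data: an 8 x 8 reference image and a 4 x 4 block copied from (2, 3).
ref = [[(3 * y + 5 * x) % 17 for x in range(8)] for y in range(8)]
cur = [row[3:7] for row in ref[2:6]]
print(full_search(ref, cur, 4))  # (0, 2, 3)
```

In the toy data the block is an exact copy of the reference pixels at row 2, column 3, so the minimum evaluation value is 0 at that position, corresponding to the ideal case mentioned above.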
With respect to each position of macro blocks 12 of the current frame (the second frame 112), a vector evaluation value having a minimum value between the macro block 12 and a past frame to be referred to (the first frame 111), in the retrieval range 21 is thus derived. By doing so, a motion vector in each position of the macro block 12 of the current frame (the second frame 112) can be derived. For that purpose, however, massive calculation needs to be conducted between frames. This is evident even if the consideration is limited to the macro block 122 in the position 15 shown in FIG. 14. In other words, while successively shifting the inspection framework 31 shown in FIG. 15 by one pixel in the horizontal direction, beginning from the top left end, work for deriving the vector evaluation value ΣE each time is effected. In addition, the inspection framework 31 is moved to a position which is shifted from the top left end downward by one pixel. While successively shifting the inspection framework 31 by one pixel in the horizontal direction, the vector evaluation value ΣE is derived each time in the same way. Such calculation processing is repeated meanderingly by the number of pixels in the vertical direction forming the inspection framework 31.
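The scale of this calculation can be estimated with a rough sketch. The retrieval range size is not stated above, so a range of plus or minus 15 pixels around each macro block is assumed purely for illustration, and clipping of the range at the frame border is ignored, which makes the figure an upper bound. The function name full_search_cost is hypothetical.

```python
def full_search_cost(width, height, search_radius, block=16):
    """Upper bound on absolute-difference operations per frame for an
    exhaustive search. `search_radius` is an assumed parameter; border
    clipping of the retrieval range is ignored."""
    positions = (2 * search_radius + 1) ** 2      # framework positions per block
    per_block = positions * block * block         # pixel differences per block
    blocks = (width // block) * (height // block) # macro blocks per frame
    return per_block * blocks


print(full_search_cost(352, 240, 15))  # 81185280
```

Under these assumptions, a single 352 by 240 frame requires on the order of 80 million absolute-difference operations before any of the measures described below are applied.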
Even if the communication rate of the transmission path is increased or the image recording rate is increased in order to make the dynamic image communication possible, therefore, a very long time is required for image processing, resulting in a problem of occurrence of a fault in real time processing.
Therefore, various measures for reducing the burden of the motion vector processing have heretofore been proposed and put to practical use.
For example, in JP 8-32969 A (1996), there is provided retrieval range control means for adaptively controlling the size of the motion vector retrieval range 21 in a frame to be referred to (the first frame 111) corresponding to the macro block to be processed. The retrieval range control means contracts the retrieval range 21 as the correlation of the motion vector position becomes higher. For an image region which is small in motion change, the retrieval range control means makes the retrieval region smaller. Thus the retrieval range control means shortens the processing time for detecting the motion vector. For an image region which is large in motion change, the retrieval range control means expands the retrieval range 21 and conducts accurate motion detection. In this way, accurate motion vector retrieval and faster processing speed are implemented efficiently.
Furthermore, in JP 10-191352 A (1998), it is attempted to improve the processing in two points. A first point is to make the processing speed faster. In the technique shown in the publication, computation of each vector evaluation value for motion vector detection as described with reference to FIG. 15 is conducted by deriving accumulation results in parallel by using accumulation addition circuits connected in parallel in a pipeline form.
A second point is to reduce the burden of the above described computation processing of the vector evaluation value. In the technique shown in JP 10-191352 A (1998), accumulation addition cease means is used in processing for deriving an accumulation addition value of absolute values of respective differences between pixels forming the macro blocks 12 in respective positions of the current frame (the second frame 112) and pixels of the inspection framework 31 of the past frame to be referred to (the first frame 111). When, in the process of deriving the accumulation addition value, the value has exceeded a predetermined threshold value, the accumulation addition cease means stops the accumulation and addition operation. By thus suspending useless computation processing in such an inspection framework 31 as not to become the detection subject of the motion vector, the accumulation addition cease means attempts to decrease the computer processing. The accumulation addition cease means will now be described more concretely.
FIG. 16 shows the principle of the motion vector detection processing using the accumulation addition cease means of this proposal. Processing similar to this is disclosed in JP 10-136373 A (1998) as well. First, in retrieving the position serving as the start point of the motion vector in the retrieval range 21 shown in FIG. 14, the minimum value of the accumulation addition value is initialized (step S41). The start point of the motion vector must be the minimum value of the accumulation addition value serving as the vector evaluation value. Since the retrieval is in its initial stage, however, a maximum value that the final accumulation addition value can take is set as the initial value for initialization. Subsequently, as the initial value of the inspection framework 31 (see FIG. 15), for example, the top left corner in the retrieval range 21 is set (step S42). The inspection framework 31 is set in this position (step S43). While deriving absolute values of differences between the inspection framework 31 and a macro block serving as a tip point of the motion vector (macro block 122 in the position 15 in FIG. 14) by taking a pixel as the unit, they are successively added (step S44). At the time of this addition, it is determined whether the current accumulation value thus obtained has become greater than a minimum accumulation value (step S45). If the minimum accumulation value is not reached (N) and the calculation of the vector evaluation value is not yet finished (N of step S46), then the processing returns to the step S44, and the accumulation operation of the vector evaluation value further advances.
On the other hand, if the value is smaller than the minimum accumulation value even when the calculation of the vector evaluation value with respect to the inspection framework 31 has been finished (Y of step S46), then the minimum accumulation value set until now is replaced with the value of the vector evaluation value which has now been finished (step S47). If calculation of the vector evaluation value in all positions in the retrieval range 21 is not finished (N of step S48), then a position in which the inspection framework 31 should be set subsequently is set as a “predetermined position” for calculation (step S49). The processing returns to the step S43, and the work for deriving the vector evaluation value with respect to the next inspection framework 31 is started.
While thus moving the position of the inspection framework 31 in the retrieval range 21, the vector evaluation value is successively calculated. If the accumulation value of the vector evaluation value with respect to the inspection framework 31 during this process exceeds the minimum accumulation value at the time, then the accumulation computation is finished at that time point, and the position of the inspection framework 31 is moved to the next position. If the calculation of the vector evaluation value in all positions in the retrieval range 21 is finally finished (Y of step S48), then the position of the inspection framework 31 having a vector evaluation value corresponding to the current minimum accumulation value is set as the start point of the motion vector (step S50).
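The flow of FIG. 16 can be sketched as follows, with the step numbers noted in comments. This is an illustrative reading of the description rather than the implementation of the cited publications: frames are assumed to be lists of rows of integer pixel levels, the retrieval range is assumed to cover the whole reference image, the maximum pixel level is assumed to be 255, and the function names are hypothetical.

```python
def sad_with_cease(ref, cur, top, left, size, limit):
    """Accumulate absolute differences pixel by pixel (step S44) and return
    None as soon as the running sum exceeds `limit` (Y of step S45)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(ref[top + y][left + x] - cur[y][x])
            if total > limit:
                return None  # accumulation ceased for this framework
    return total


def search_with_cease(ref, cur, size, max_pixel=255):
    """Return (minimum accumulation value, (top, left)) as in FIG. 16."""
    best_value = size * size * max_pixel      # S41: largest possible sum
    best_pos = None
    for top in range(len(ref) - size + 1):    # S42, S49: raster order
        for left in range(len(ref[0]) - size + 1):
            value = sad_with_cease(ref, cur, top, left, size, best_value)
            if value is None:
                continue                      # move on to the next position
            if best_pos is None or value < best_value:
                best_value, best_pos = value, (top, left)  # S47
    return best_value, best_pos               # S50: start point of the vector


ref = [[(3 * y + 5 * x) % 17 for x in range(8)] for y in range(8)]
cur = [row[3:7] for row in ref[2:6]]
print(search_with_cease(ref, cur, 4))  # (0, (2, 3))
```

Note that every framework position is still visited; only the accumulation inside a position is cut short once it can no longer beat the current minimum.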
Heretofore, proposals for improving the processing for calculating the motion vector have been described. Even if these proposals are adopted, there is no change in that the burden of the encoding processing is considerably heavy. For example, it is now assumed that the processing shown in FIG. 16 is executed and the minimum accumulation value finally becomes a value R. Until this value is reached, however, it is necessary to continue the computation of the accumulation value with respect to individual inspection frameworks 31 even for values greater than R. Furthermore, even if the value R is found in a comparatively early stage, it is necessary to move the inspection framework 31 to all positions in the retrieval range 21 by taking a pixel as the unit and continue the computation of the accumulation value for each of them until the value R is reached.
If it is attempted to execute an image compression technique such as MPEG in real time in a simple processing apparatus such as a personal computer or a small sized information processing terminal, therefore, the case where the encoding processing time per frame exceeds the frame interval still occurs. This results in a problem that smoothly moving dynamic image communication and dynamic image recording are hampered in many cases.
Therefore, an object of the present invention is to provide a motion vector detection apparatus capable of drastically shortening the processing time required for detection of the motion vector.
In the present invention, a vector evaluation value is derived while successively moving an inspection framework within a retrieval range, when detecting a motion vector. In this process, a fixed detection threshold value is set in a macro block included in the frame under processing and already finished in motion vector detection processing, on the basis of its vector evaluation value. If a vector evaluation value less than the fixed detection threshold value emerges in comparison, then a motion vector is judged to have been detected in that stage. As a result, subsequent retrieval processing for the macro block for which a motion vector is to be detected becomes unnecessary, resulting in faster processing. When the macro block specifying means has specified one macro block, vector evaluation values might be already derived for some macro blocks among a predetermined number of macro blocks located near the macro block in the same frame under processing and having specific position relations to the macro block. In this case, the fixed detection threshold value is set on the basis of those values. The reason why values of the motion vector of macro blocks located in the vicinity are referred to is that the macro blocks and the macro block for which the motion vector is to be derived typically have very intense correlation in motion of the dynamic image portion.
Further, in the present invention, the fixed detection threshold value setting means sets a fixed detection threshold value out of vector evaluation values of a predetermined number of macro blocks having specific position relations. Thereby, the fixed detection threshold value setting means makes the condition for judging the motion vector strict, and raises the reliability of the motion vector in the case where a vector evaluation value less than the fixed detection threshold value is obtained.
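The retrieval discontinuance using the fixed detection threshold value can be sketched as follows. Several points are assumptions made only for illustration: the threshold is taken as the smallest vector evaluation value among the neighbouring macro blocks already processed, the comparison is "at most" (the text uses both "less than" and "at most"), and the retrieval range covers the whole reference image. The function names are hypothetical.

```python
def sad(ref, cur, top, left, size):
    """Sum of absolute differences for the framework at (top, left)."""
    return sum(abs(ref[top + y][left + x] - cur[y][x])
               for y in range(size) for x in range(size))


def fixed_threshold(neighbour_values):
    """Assumption: use the minimum evaluation value of the neighbouring
    macro blocks already processed; None if none is available yet."""
    known = [v for v in neighbour_values if v is not None]
    return min(known) if known else None


def search_with_threshold(ref, cur, size, threshold):
    """Return ((value, top, left), positions_examined). The scan stops as
    soon as a value at most `threshold` is found."""
    best, examined = None, 0
    for top in range(len(ref) - size + 1):
        for left in range(len(ref[0]) - size + 1):
            value = sad(ref, cur, top, left, size)
            examined += 1
            if best is None or value < best[0]:
                best = (value, top, left)
            if threshold is not None and value <= threshold:
                return best, examined   # motion vector judged as detected
    return best, examined


ref = [[(3 * y + 5 * x) % 17 for x in range(8)] for y in range(8)]
cur = [row[3:7] for row in ref[2:6]]
print(fixed_threshold([12, 7, None]))            # 7
print(search_with_threshold(ref, cur, 4, None))  # ((0, 2, 3), 25)
print(search_with_threshold(ref, cur, 4, 0))     # ((0, 2, 3), 14)
```

In the toy data, a threshold of 0 ends the scan after 14 of the 25 positions, whereas the scan without a threshold visits all 25.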
Further, in the present invention, it is taken into consideration that the motion vector detection apparatus finishes the processing when a vector evaluation value which is at most the fixed detection threshold value is obtained, and a contrivance is made so that such a vector evaluation value is obtained as soon as possible. When setting a retrieval range for a macro block which is included in the frame under processing, the prediction means predicts a motion of a dynamic image portion by referring to a motion vector of a frame already finished in processing and included in an image portion corresponding to the macro block, for example by referring to a past motion vector according to the processing method. The retrieval range setting means sets a retrieval range in order that the position predicted by the prediction means may become its central position. As a result, calculation of the vector evaluation value can be conducted in a range having high likelihood and in an order having high likelihood.
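Setting a retrieval range centred on a predicted position can be sketched as follows. The conventions are assumptions for illustration only: positions are (row, column) of the top-left pixel of a block, the past motion vector is given as the displacement (dy, dx) from start point to tip point, so the predicted start point is the current position minus that displacement, and the range is clipped to the frame. The function names and the radius parameter are hypothetical.

```python
def clamp(value, low, high):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(value, high))


def centred_retrieval_range(block_pos, prev_vector, radius, frame_size, block=16):
    """Return (top_min, top_max, left_min, left_max), the inclusive range of
    inspection framework positions centred on the predicted start point."""
    top, left = block_pos
    dy, dx = prev_vector            # assumed: displacement from start to tip
    height, width = frame_size
    centre_top = clamp(top - dy, 0, height - block)
    centre_left = clamp(left - dx, 0, width - block)
    return (max(0, centre_top - radius),
            min(height - block, centre_top + radius),
            max(0, centre_left - radius),
            min(width - block, centre_left + radius))


# Macro block at row 96, column 160; the co-located block previously moved
# 4 lines down and 6 pixels to the left.
print(centred_retrieval_range((96, 160), (4, -6), 8, (240, 352)))
# (84, 100, 158, 174)
```

Because the range is centred where the image portion is expected to have come from, a comparatively small radius may suffice.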
Further, in the present invention, coarse motion vector retrieval processing is first conducted and tentative motion vector detection is conducted in a predetermined retrieval range. On the basis of the result, the range for conducting fine motion vector retrieval processing is narrowed. With respect to the narrowed range, fine motion vector retrieval is conducted. Thereby efficiency of the retrieval is increased. In addition, a technical thought of retrieval discontinuance using the fixed detection threshold value is applied to such a retrieval technique in order to shorten the processing time.
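The two-stage retrieval can be sketched as follows. The coarse step of 2 pixels, the fine window of plus or minus (step - 1) pixels around the tentative position, and the optional threshold check with an "at most" comparison are assumptions chosen for illustration, as are the function names scan and coarse_to_fine.

```python
def sad(ref, cur, top, left, size):
    """Sum of absolute differences for the framework at (top, left)."""
    return sum(abs(ref[top + y][left + x] - cur[y][x])
               for y in range(size) for x in range(size))


def scan(ref, cur, size, tops, lefts, threshold=None):
    """Evaluate the given positions; stop early if a value at most
    `threshold` is found. Returns (value, top, left) of the best so far."""
    best = None
    for top in tops:
        for left in lefts:
            value = sad(ref, cur, top, left, size)
            if best is None or value < best[0]:
                best = (value, top, left)
            if threshold is not None and value <= threshold:
                return best
    return best


def coarse_to_fine(ref, cur, size, step=2, threshold=None):
    """Coarse scan every `step` pixels, then fine scan around the result."""
    max_top, max_left = len(ref) - size, len(ref[0]) - size
    _, ctop, cleft = scan(ref, cur, size,
                          range(0, max_top + 1, step),
                          range(0, max_left + 1, step), threshold)
    tops = range(max(0, ctop - step + 1), min(max_top, ctop + step - 1) + 1)
    lefts = range(max(0, cleft - step + 1), min(max_left, cleft + step - 1) + 1)
    return scan(ref, cur, size, tops, lefts, threshold)


ref = [[(3 * y + 5 * x) % 17 for x in range(8)] for y in range(8)]
cur = [row[2:6] for row in ref[2:6]]   # block copied from position (2, 2)
print(coarse_to_fine(ref, cur, 4))     # (0, 2, 2)
```

The fine window always contains the tentative position found by the coarse scan, so the final evaluation value is never worse than the coarse one. On real images the coarse stage relies on neighbouring positions having similar evaluation values, which the synthetic toy data does not model.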
Further, in the present invention, the retrieval range is not merely scanned simply as in the raster scan of TV. The inspection framework is moved from a central portion having high likelihood to a peripheral part so as to be able to detect a vector evaluation value which is at most the fixed detection threshold value as soon as possible. The processing time is thus shortened.
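One possible ordering from the central portion to the peripheral part is sketched below. The exact scanning path is not specified above, so the ordering by square rings around the centre (with raster order inside each ring) is an assumption, and the function name centre_out_positions is hypothetical.

```python
def centre_out_positions(top_min, top_max, left_min, left_max):
    """List every (top, left) in the inclusive range, ordered by ring
    distance from the centre of the retrieval range."""
    centre_top = (top_min + top_max) // 2
    centre_left = (left_min + left_max) // 2
    positions = [(t, l)
                 for t in range(top_min, top_max + 1)
                 for l in range(left_min, left_max + 1)]
    # Chebyshev distance gives square rings; the sort is stable, so positions
    # within the same ring stay in raster order.
    positions.sort(key=lambda p: max(abs(p[0] - centre_top),
                                     abs(p[1] - centre_left)))
    return positions


order = centre_out_positions(0, 4, 0, 4)
print(order[0])    # (2, 2)
print(order[1:9])  # the eight positions surrounding the centre
```

Feeding positions in this order to a scan that stops at the fixed detection threshold value lets the most likely positions be examined first.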
Further, in the present invention, if a value better than expected is obtained as a vector evaluation value in the process of the coarse motion vector retrieval scanning, then a motion vector can be set on the basis of the position without conducting fine motion vector retrieval scanning, in order to shorten the processing time. As a matter of course, in such a situation that time shortening is demanded very severely, it is also possible to conduct only the coarse motion vector retrieval scanning and detect the motion vector from only the result.
According to the present invention, a vector evaluation value is derived while successively moving an inspection framework within a retrieval range, when detecting a motion vector. In this process, a fixed detection threshold value is set in a macro block included in the frame under processing and already finished in motion vector detection processing, on the basis of its vector evaluation value. If a vector evaluation value less than the fixed detection threshold value emerges in comparison, then a motion vector is judged to have been detected in that stage. As a result, subsequent retrieval processing for the macro block for which a motion vector is to be detected becomes unnecessary, resulting in faster processing.
Further, according to the present invention, the fixed detection threshold value setting means sets a fixed detection threshold value out of vector evaluation values of a predetermined number of macro blocks having specific position relations. By utilizing the correlation to motion vectors of macro blocks in the neighborhood, therefore, detection of a motion vector having high likelihood can be made possible.
Further, according to the present invention, when setting a retrieval range for a macro block which is included in the frame under processing and for which the motion vector detection means is to derive a motion vector, the retrieval range setting means predicts a position of existence of a pixel portion corresponding to the macro block in the reference frame, by referring to a motion vector of a macro block having the same position as the above described macro block in a frame already finished in processing, and sets a retrieval range in order that the position may become its central position. Even if the retrieval range is a relatively narrow range, therefore, a motion vector can be favorably detected. Furthermore, since the central position of the retrieval range is set to a predicted position, the motion vector can be detected at high speed.
Further, according to the present invention, the inspection framework moving means includes first inspection framework moving means for moving an inspection framework coarsely in the retrieval range and second inspection framework moving means for moving an inspection framework finely in the retrieval range. Rough detection of the motion vector can be conducted by using the first inspection framework moving means. In such a case that tendency of the motion vector is hard to judge, the invention is effective. Furthermore, since the retrieval range of the second inspection framework moving means can be limited, it is possible to reconcile the detection precision and the processing time. Furthermore, since the invention of claim 4 applies the invention of claim 1, the processing time can be shortened for both the detection of the coarse motion vector and the fine motion vector. As a result, total processing time can be shortened.
Further, according to the present invention, the inspection framework moving means successively moves an inspection framework from a central portion of the retrieval range to a peripheral part. In such a general case that a central portion has high likelihood, therefore, it becomes possible to find a vector evaluation value which is at most the fixed detection threshold value earlier. In the case where such a technique that the processing is finished at the time of finding is adopted, shortening of the processing time can be implemented.
Further, according to the present invention, when a vector evaluation value of a motion vector judged by the motion vector detection decision means in the motion vector detection apparatus by using the first motion vector detection means is less than the fixed detection threshold value, the judged motion vector is set as a fixed motion vector without conducting final detection of a motion vector using the second motion vector detection means. After conducting only the process of the coarse retrieval scanning, detection of the motion vector can be finished, contributing greatly to shortening the processing time.