Exemplary embodiments of the present invention relate to a method and apparatus for effectively extracting a motion vector during a motion search.
Recently, videos have been rapidly digitalized and put to use. TV broadcasting has been digitalized, and domestic appliances such as DVD recorders and digital video cameras have become widespread. Video distribution over the Internet and the use of video mail, video phones, and video conferencing on mobile phones have also become common.
Video coding technology is a core technology for video digitalization. With the advancement of technology represented by H.264 and the demand for high resolution represented by HDTV, the computation amount of video coding is increasing day by day. In particular, the increase in the computation amount of block matching, which accounts for most of the computation amount, is becoming a serious problem.
Furthermore, a software encoder inevitably performs only imperfect block matching. Therefore, as the imperfect block matching becomes a main factor of the degradation of video quality, there is demand for the implementation of a high-efficiency search algorithm capable of reducing the computation amount and improving the accuracy of block matching.
A variety of search methods have been proposed for block matching. Although these methods may achieve the video quality of a full search in videos for which block matching is easy, image degradation of 1 dB or more may occur in some videos. This may be a major factor in degrading the video quality.
For example, since a spiral motion search method has a small computation amount, it is mainly used as the search method of a software encoder. However, the accuracy of the spiral motion search method may be significantly degraded depending on the video. In the spiral motion search method, the computation amount is minimized by stopping the search according to a predetermined rule, but this may have an adverse effect. For example, when the search is trapped at a local optimal point, it may not reach the optimal point of the wider area.
A multi-step search method, on the other hand, is less effective at reducing the computation amount, but does not suffer an extreme degradation of accuracy. Therefore, the multi-step search method is widely used as the search method of a hardware encoder. However, since the accuracy of each individual motion decreases, the accuracy for a video in which most motions are constant may be degraded in comparison with the spiral motion search method.
The motion search refers to a process of searching for a coincident position within a reference video for each block of a coding target video. As an evaluation reference of coincidence, the sum of absolute differences (SAD) is generally used. When a block within the coding target video is represented by B, a candidate vector is represented by v, the pixel value at a position r within the coding target video is represented by Icur(r), and the pixel value at a position r within the reference video is represented by Iref(r), the SAD for the candidate vector v in the block B is expressed as Equation (1) below.
$$\mathrm{SAD}(B, v) = \sum_{r \in B} \left| I_{\mathrm{cur}}(r) - I_{\mathrm{ref}}(r + v) \right| \qquad \text{Eq. (1)}$$
SAD(B, v) is calculated for several candidate vectors, and the vector v at which SAD(B, v) is minimized is finally decided as the motion vector.
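By way of illustration only, Equation (1) and the selection of the minimizing vector may be sketched as follows. The function names `sad` and `full_search`, the `(x0, y0, w, h)` block layout, and the `[y][x]` list-of-lists frame representation are assumptions made for this sketch and are not part of the described embodiments.

```python
def sad(cur, ref, block, v):
    """Equation (1): sum of absolute differences for candidate vector v = (dx, dy).

    cur, ref: 2-D lists of pixel values indexed as [y][x] (assumed layout).
    block: (x0, y0, w, h), the block B within the coding target frame.
    """
    x0, y0, w, h = block
    dx, dy = v
    total = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            total += abs(cur[y][x] - ref[y + dy][x + dx])
    return total


def full_search(cur, ref, block, search_range):
    """Evaluate every in-bounds vector within +/- search_range pixels.

    Returns (best_vector, best_sad); ties keep the first vector found.
    """
    x0, y0, w, h = block
    height, width = len(ref), len(ref[0])
    best_v, best_sad = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip vectors that would read outside the reference frame.
            if not (0 <= x0 + dx and x0 + dx + w <= width
                    and 0 <= y0 + dy and y0 + dy + h <= height):
                continue
            cost = sad(cur, ref, block, (dx, dy))
            if best_sad is None or cost < best_sad:
                best_v, best_sad = (dx, dy), cost
    return best_v, best_sad
```

The nested loops make the cost of the full search explicit: the number of SAD evaluations grows with the square of the search range, which is the computation amount the reduced-search methods aim to avoid.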
The SAD is the most efficient evaluation reference. However, the computation amount inevitably increases in the full search, which evaluates all search points in the search range. Therefore, a variety of algorithms for reducing the computation amount, such as 3-step search, 4-step search, and diamond search, have been proposed.
In the spiral motion search method, a search is performed from an arbitrary point within a video toward the surroundings, and is stopped at a point where a predetermined condition is satisfied. When the search start position is accurately estimated, the search can be performed more efficiently. A variety of methods may be applied to decide the search start point. For example, a vector calculated during the motion search of a previous video may be referenced, or a vector calculated from blocks surrounding the current block may be referenced. Furthermore, the referenced vectors may be combined. As a search stop rule, the SAD may be calculated in search order, and the search may be stopped when the value decreases and then increases. This method searches only a specific region of a video, and may minimize the number of search points. However, when an estimation mistake occurs, the effect of suppressing the degradation of video quality or of reducing the computation amount may not be sufficiently obtained.
In the multi-step search method, a sub-sampling video is created from the video for which a motion is to be calculated, so that the resolution is decreased to reduce the number of search target points and the computation amount of block matching. A full search is performed in the video having the lowest resolution, and the search result is used to search a video having a higher resolution. The resolution is then gradually increased until a motion vector is finally calculated in the original video. For this reason, the method is called a multi-step search method. In this method, since the number of search target points is decreased by reducing the resolution of the sub-sampling video, the computation amount is reduced at a predetermined rate regardless of the characteristic of the video, and the entire video is searched. Therefore, the video quality is not significantly degraded. However, since the resolution must be significantly reduced in order to reduce the computation amount, the amount of detailed video information decreases, which degrades the accuracy. In order to make up for this disadvantage, the following methods have been proposed: a method of referring to the search results of surrounding blocks, a method of performing filtering on the sub-sampling video, and a method of supplementing the decrease of detailed video information by expanding a template when calculating a motion vector.
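As a minimal sketch of the coarse-to-fine idea, the following assumes only two resolution levels, 2x2 averaging as the sub-sampling, and a refinement of plus or minus one pixel at full resolution. All of these choices, and the names `downsample`, `window_search`, and `two_level_search`, are assumptions for illustration; the described embodiments do not fix them. Block coordinates and sizes are assumed to be even so that they halve exactly.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 neighbourhood (integer division)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


def sad(cur, ref, block, v):
    """Sum of absolute differences for vector v = (dx, dy) over block (x0, y0, w, h)."""
    x0, y0, w, h = block
    return sum(abs(cur[y][x] - ref[y + v[1]][x + v[0]])
               for y in range(y0, y0 + h) for x in range(x0, x0 + w))


def window_search(cur, ref, block, centre, radius):
    """Exhaustively search the (2*radius+1)^2 vectors around `centre`."""
    x0, y0, w, h = block
    best_v, best_cost = None, None
    for dy in range(centre[1] - radius, centre[1] + radius + 1):
        for dx in range(centre[0] - radius, centre[0] + radius + 1):
            inside = (0 <= x0 + dx and x0 + dx + w <= len(ref[0])
                      and 0 <= y0 + dy and y0 + dy + h <= len(ref))
            if not inside:
                continue  # vector would read outside the reference frame
            cost = sad(cur, ref, block, (dx, dy))
            if best_cost is None or cost < best_cost:
                best_v, best_cost = (dx, dy), cost
    return best_v, best_cost


def two_level_search(cur, ref, block, coarse_radius):
    """Full search at half resolution, then refine by +/-1 pixel at full resolution."""
    x0, y0, w, h = block
    coarse_block = (x0 // 2, y0 // 2, w // 2, h // 2)
    coarse_v, _ = window_search(downsample(cur), downsample(ref),
                                coarse_block, (0, 0), coarse_radius)
    # A one-pixel vector at half resolution corresponds to two pixels at full resolution.
    return window_search(cur, ref, block, (2 * coarse_v[0], 2 * coarse_v[1]), 1)
```

With a coarse radius of R, the sketch covers roughly plus or minus 2R+1 pixels at full resolution while evaluating (2R+1)^2 small-block SADs plus nine full-resolution SADs, which illustrates why the computation amount falls at a fixed rate regardless of the video content.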
In order to efficiently perform the spiral motion search on each of an original video and a sub-sampling video, the following methods are combined: (1) multi-stage search start point decision, (2) double search range setting, and (3) adaptive stop condition decision.
The multi-stage search start point decision is performed as follows: several candidate vectors are set, SADs are calculated at the position indicated by each candidate vector and at the positions moved from that position by one pixel in each of the four directions, and the position at which the coincidence is the highest, that is, the SAD is the lowest, is set as the search start position. The number of candidates may be increased so that the search start position is set to a more suitable position. However, when a large number of candidates are compared at once, the computation amount inevitably increases. Therefore, the candidates are not all compared at once, but are divided into several groups and then compared group by group.
First, the coincidence of a position indicated by each candidate in a first candidate group is calculated. When the highest coincidence is larger than a threshold value, a position at which the highest coincidence is obtained is set to a search start position. When the highest coincidence is smaller than the threshold value, the coincidence of a position indicated by each candidate in the next candidate group is calculated.
The coincidence calculation is repeated, group by group, over all of the candidates until a candidate indicating a position where the coincidence is larger than the threshold value appears.
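The group-by-group decision described above may be sketched as follows, purely as an illustration. Coincidence is expressed here through the SAD, so "coincidence larger than the threshold value" is written as "SAD smaller than a threshold". The function name `decide_start_position`, the callback `sad_at`, and the fallback of returning the best position seen when no group meets the threshold are assumptions of this sketch.

```python
def decide_start_position(groups, sad_at, threshold):
    """Pick a search start position from candidate groups, evaluated in order.

    groups: list of lists of candidate vectors (x, y), highest priority first.
    sad_at: hypothetical callback returning the SAD at a given vector.
    threshold: SAD below which the current best position is accepted at once.
    """
    # Each candidate is evaluated together with its four one-pixel neighbours.
    offsets = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))
    best_v, best_cost = None, float("inf")
    for group in groups:
        for cx, cy in group:
            for dx, dy in offsets:
                v = (cx + dx, cy + dy)
                cost = sad_at(v)
                if cost < best_cost:
                    best_v, best_cost = v, cost
        # Later groups are evaluated only if this group was not good enough.
        if best_cost < threshold:
            break
    return best_v, best_cost
```

For example, with three single-candidate groups, the third group is never evaluated if a position found from the second group already has a SAD below the threshold, which is how the grouping limits the computation amount.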
The group classification of the candidate vectors for deciding on a search start position is performed by sub-sampling video search, original video search with sub-sampling video search, and original video search without sub-sampling video search.
During the sub-sampling video search, candidates obtained by using spatial correlations of a video are divided into three groups. The divided three groups are set as the first, second, and third groups, and a predetermined position of the video is set to a fourth group.
During the original video search, when sub-sampling video search is previously performed, the search result is set to first and second groups, candidates obtained by using spatial correlations are set to a third group, candidates obtained by using temporal correlations are set to a fourth group, and a predetermined position of the video is set to a fifth group. Furthermore, when the sub-sampling video search is not performed, candidates obtained by using spatial correlations are set to a first group, candidates obtained by using temporal correlations are set to a second group, and a predetermined position of the video is set to a third group.
During the candidate extraction of each group, when the X and Y components of a newly-acquired candidate vector have a difference within three pixels from a previously-acquired candidate vector, the candidate is not added to the group.
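The near-duplicate rule for candidate extraction may be sketched as follows. The sketch assumes that "within three pixels" is inclusive and must hold for both the X and Y components at the same time; the function name `add_candidate` and the `min_gap` parameter are hypothetical.

```python
def add_candidate(group, v, min_gap=3):
    """Append vector v = (x, y) to group unless it is close to an existing candidate.

    A candidate counts as close when both its X and Y components differ from a
    previously acquired candidate by at most min_gap pixels (assumed inclusive).
    Returns True if the candidate was added.
    """
    for px, py in group:
        if abs(v[0] - px) <= min_gap and abs(v[1] - py) <= min_gap:
            return False
    group.append(v)
    return True
```

Because each candidate is later evaluated together with its one-pixel neighbours, dropping candidates that lie this close to an existing one avoids spending SAD calculations on nearly the same positions.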
When the stop condition is not properly set for the characteristic of the video, the search may be stopped before an optimal value is acquired, and thus the video quality may be significantly degraded. Conversely, the search may be continued even after the optimal value has been passed, which significantly increases the computation amount. Therefore, a double search range may be set with the search start point as the center. In the internal small search range, the search is performed without a stop decision, which prevents the degradation of the video quality. In the external large search range, the search is performed with a stop decision. The large search range is set to be smaller than the original search range (full search range). Therefore, even if the entire large search range is searched without the search being stopped, the increase in the computation amount is small. When the video moves as a whole in a predetermined direction, it is highly likely that the search start position deviates from the estimated position. Therefore, a standard deviation of the motion vectors obtained by the motion search is used as an indicator for deciding on the search range.
A stop condition based on a threshold value is adopted. In this case, the search is stopped at the time point where a position having an SAD less than the threshold value appears. During the sub-sampling video search, a search is performed for 13 kinds of templates for one block. However, only the maximum template (the template obtained by combining four blocks) is used for the stop decision.
The threshold value for the search stop decision is decided based on the SADs obtained when motion vectors were decided, these motion vectors being the basis for the candidate vectors used in deciding on the search start position. When the motion within a video is similar, the minimum SADs may be expected to have similar values. Accordingly, the difference between the maximum SAD and the minimum SAD obtained when the respective candidate vectors were decided is added, as an allowance, to the SAD obtained when the candidate vector selected as the search start position was decided.
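Read literally, the threshold decision amounts to a single addition, which may be sketched as follows. The function name `stop_threshold` and the representation of the inputs as a list of SADs with an index for the selected candidate are assumptions made for illustration.

```python
def stop_threshold(candidate_sads, chosen_index):
    """Threshold for the search stop decision.

    candidate_sads: the SAD recorded when each candidate vector was originally
        decided as a motion vector (one value per candidate).
    chosen_index: index of the candidate selected as the search start position.
    """
    # Allowance: spread between the worst and best SAD among the candidates.
    allowance = max(candidate_sads) - min(candidate_sads)
    return candidate_sads[chosen_index] + allowance
```

For instance, with recorded SADs of 120, 95, and 140 and the second candidate selected, the allowance is 45 and the threshold is 140. When all candidates have the same SAD, the allowance is zero and the threshold equals that SAD.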
Even when no SAD less than the threshold value has appeared, the search is stopped when the vector has not been updated during two rotations along the spiral.
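The double search range and the two stop conditions may be combined as in the following sketch. Several points are assumptions made for illustration only: one "rotation" is modelled as one square ring around the start position, the ranges are given as ring counts `inner` and `outer`, rotations without an update are counted from the first ring but acted upon only in the outer range, and `sad_at` is a hypothetical callback returning the SAD at a vector.

```python
def ring(radius):
    """Offsets on the square ring at Chebyshev distance `radius` from the origin."""
    if radius == 0:
        return [(0, 0)]
    points = []
    for d in range(-radius, radius):
        # One point from each of the four sides per step; corners appear once.
        points += [(d, -radius), (radius, d), (-d, radius), (-radius, -d)]
    return points


def spiral_search(sad_at, start, inner, outer, threshold, max_idle=2):
    """Ring-by-ring search around `start` with a double search range.

    Rings 1..inner are always searched in full (no stop decision).
    Rings inner+1..outer are searched with two stop decisions:
      - the best SAD found so far is below `threshold`, or
      - the best vector was not updated during `max_idle` consecutive rings.
    """
    best_v, best_cost = start, sad_at(start)
    idle = 0
    for radius in range(1, outer + 1):
        in_outer = radius > inner
        if in_outer and (best_cost < threshold or idle >= max_idle):
            break
        updated = False
        for dx, dy in ring(radius):
            v = (start[0] + dx, start[1] + dy)
            cost = sad_at(v)
            if cost < best_cost:
                best_v, best_cost, updated = v, cost, True
            if in_outer and best_cost < threshold:
                return best_v, best_cost
        idle = 0 if updated else idle + 1
    return best_v, best_cost
```

In this sketch the inner range guards against stopping too early, while the outer range bounds the worst case to the points within `outer` rings of the start position.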