The present invention relates to motion-vector detecting devices, motion-vector detecting methods, and computer programs. More specifically, the present invention relates to a motion-vector detecting device, a motion-vector detecting method, and a computer program for detecting a motion vector from moving-picture data.
With the recent improvement in the functions of information processing apparatuses and communication terminals, the development of high-speed communication infrastructures, and the spread of high-density recording media such as DVDs and Blu-ray discs, distribution of moving-picture data via networks, storage and playback of moving-picture data using high-density recording media, and so forth are actively practiced. Under these circumstances, improvement in the efficiency or speed of data processing of moving-picture data, such as encoding, is desired.
For example, detection of a direction and magnitude (velocity) of movement of each object included in image data, i.e., detection of a motion vector, is needed in motion-compensated image encoding in high-efficiency encoding of moving-picture data, and in detection of moving objects or detection of speed by a vision sensor of a traffic monitoring system or an autonomous traveling vehicle.
For example, MPEG (Moving Picture Experts Group) encoding, which is an international standard of high-efficiency encoding of moving pictures, has been proposed as an example of motion-compensated image encoding. The MPEG encoding executes encoding by a combination of DCT (Discrete Cosine Transform) and motion-compensated predictive coding. In motion-compensated predictive coding, correlation of image-signal levels between successive frames of moving-picture data, i.e., a current frame and an immediately previous frame, is detected, motion vectors are detected on the basis of the correlation detected, and the moving picture is corrected on the basis of the motion vectors detected, so that efficient encoding is achieved.
As a method of detecting a motion vector, block matching is known. An overview of block matching will be described with reference to FIG. 1. Temporally successive frame images of a moving picture, e.g., a current frame [Ft] 20 at time (t) and a previous frame [Ft−1] 10 at time (t−1) in the figure, are extracted. One screen of the frame image is divided into small regions (hereinafter referred to as blocks) of m pixels×n lines composed of a plurality of pixels.
Using the current frame [Ft] 20 as a reference frame, a check block By 21 of the reference frame is moved within a certain search area 22, and a check block having a minimum pixel-value difference, i.e., having a highest degree of matching (highest correlation) of pixel value, with a reference block Bx 11 of the previous frame [Ft−1] 10 is detected. It is estimated that the reference block Bx 11 of the previous frame [Ft−1] 10 has moved to the position of the highly correlated check block detected from the current frame [Ft] 20. On the basis of a vector representing the estimated motion, a motion vector for each pixel is calculated. As described above, in block matching, a motion vector is determined by checking correlation (matching) between frames on a basis of individual blocks having a predetermined size (m×n).
In block matching, motion vectors are determined on a block-by-block basis. As an evaluation value representing correlation of each block, i.e., a degree of matching, for example, a sum of absolute values of frame differences, obtained by calculating frame differences by subtracting values between a plurality of pixels in the reference block Bx and a plurality of pixels at spatially corresponding positions in the check block By and accumulating absolute values of the frame differences calculated, is used. Alternatively, for example, a sum of squares of frame differences may be used.
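The two evaluation values mentioned above can be illustrated with a minimal sketch. The function names `sad` and `ssd` and the sample pixel values are assumptions made for illustration only; blocks are represented as lists of rows of pixel values.

```python
def sad(ref_block, check_block):
    """Sum of absolute values of differences between spatially
    corresponding pixels of two equal-sized blocks."""
    return sum(abs(a - b)
               for ref_row, chk_row in zip(ref_block, check_block)
               for a, b in zip(ref_row, chk_row))


def ssd(ref_block, check_block):
    """Sum of squares of differences between spatially
    corresponding pixels of two equal-sized blocks."""
    return sum((a - b) ** 2
               for ref_row, chk_row in zip(ref_block, check_block)
               for a, b in zip(ref_row, chk_row))


# Hypothetical 2x2 reference block Bx and check block By.
bx = [[10, 20], [30, 40]]
by = [[12, 18], [30, 45]]
print(sad(bx, by))  # 2 + 2 + 0 + 5 = 9
print(ssd(bx, by))  # 4 + 4 + 0 + 25 = 33
```

In both cases a smaller value indicates a higher degree of matching, and a value of 0 indicates identical blocks.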
In the block matching described above, however, since complete searching is executed to compare all the data in the search area, disadvantageously, the number of times of comparison needed for detection is very large, so that it takes a long time for motion detection.
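The cost of complete searching can be seen in a minimal sketch such as the following. It is an illustration under stated assumptions, not the processing of any particular encoder: the helper names (`block_at`, `sad`, `full_search`, `make_frame`), the 8×8 frame size, and the pixel values are all hypothetical. Every displacement within the search area is tested, and the number of pixel comparisons is counted.

```python
def block_at(frame, top, left, n, m):
    """Return the block of n lines by m pixels whose top-left corner is (top, left)."""
    return [row[left:left + m] for row in frame[top:top + n]]


def sad(a, b):
    """Sum of absolute differences of two equal-sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def full_search(prev, cur, top, left, n, m, search):
    """Test every displacement (dx, dy) within +/- search pixels and
    return (best vector, its SAD, number of pixel comparisons)."""
    ref = block_at(prev, top, left, n, m)
    height, width = len(cur), len(cur[0])
    best, best_cost, comparisons = None, None, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + n > height or l + m > width:
                continue  # check block would fall outside the frame
            cost = sad(ref, block_at(cur, t, l, n, m))
            comparisons += n * m
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost, comparisons


def make_frame(top, left):
    """Hypothetical 8x8 frame: zero background with one 2x2 patch."""
    frame = [[0] * 8 for _ in range(8)]
    patch = [[50, 60], [70, 80]]
    for r in range(2):
        for c in range(2):
            frame[top + r][left + c] = patch[r][c]
    return frame


prev = make_frame(2, 2)   # patch at line 2, pixel 2
cur = make_frame(3, 4)    # same patch moved by dx=2, dy=1
vector, cost, comparisons = full_search(prev, cur, 2, 2, 2, 2, 2)
print(vector, cost, comparisons)  # (2, 1) 0 100
```

Even for this single 2×2 block and a search range of ±2 pixels, 25 positions and 100 pixel comparisons are needed; the count grows with the product of the block size and the search-area size, and is repeated for every block of the frame.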
Furthermore, when a moving portion and a still portion are included in a block, motion detected on a block basis does not accurately correspond to motion of individual pixels in the block. Although this problem can be alleviated by adjusting the block size, each choice has a drawback. When the block size is increased, the amount of calculation increases, and the problem of a plurality of motions in the block is likely to occur. Conversely, when the block size is decreased, since the area for checking of matching becomes smaller, the problem of reduced accuracy of motion detection arises. That is, when block matching is performed, a large number of check blocks similar to the reference block, i.e., check blocks having high correlation with the reference block, are likely to occur. This is because check blocks whose similarity is not due to motion are included. This reduces the accuracy of motion detection. For example, when a text telop moves horizontally or vertically, the effect of a repeating pattern is likely to occur. In the case of a text pattern of Chinese characters, when the same character is divided into small portions, the same pattern often occurs. Thus, when a plurality of motions exist in a block, it is difficult to determine correct motion.
The applicant of this patent application has proposed a motion-vector detecting method and detecting device in which motion vectors can be detected for individual pixels and incorrect detection is prevented without increasing the amount of calculation, for example in Patent Document 1.
The point of the motion-vector detecting process disclosed in Patent Document 1 is that instead of calculating an evaluation value and determining a motion vector for each pixel or each block, as a first step of processing, a plurality of blocks each composed of a plurality of pixels are set in one frame and representative points of the individual blocks are set, correlation between each of the representative points and each pixel in a search area set in another frame is checked, and evaluation values based on correlation information are calculated to generate an evaluation-value table as correlation information based on evaluation values, and a plurality of candidate vectors are extracted from the evaluation-value table. Then, as a second step of processing, a presumably optimal candidate vector is selected from the extracted candidate vectors and associated with each pixel, thereby determining the candidate vector as a motion vector for each pixel. As described above, motion vectors for individual pixels are determined by:
Generating an evaluation-value table;
Selecting candidate vectors on the basis of the evaluation-value table; and
Associating a candidate vector selected from a plurality of candidate vectors as a motion vector for each pixel.
This method will hereinafter be referred to as a candidate-vector method.
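The second and third steps listed above can be sketched as follows, assuming an evaluation-value table has already been generated. This is a simplified illustration, not the processing disclosed in Patent Document 1: the function names, the table contents, and the frames are hypothetical, and the association step here compares a single pixel for brevity rather than a block of neighboring pixels.

```python
def select_candidates(table, count):
    """Take the `count` displacements with the highest accumulated
    evaluation values as candidate vectors."""
    ranked = sorted(table.items(), key=lambda item: item[1], reverse=True)
    return [vector for vector, _ in ranked[:count]]


def assign_vector(prev, cur, y, x, candidates):
    """Associate with pixel (y, x) of the previous frame the candidate
    (dx, dy) whose destination pixel in the current frame matches best."""
    height, width = len(cur), len(cur[0])
    best, best_cost = None, None
    for dx, dy in candidates:
        ty, tx = y + dy, x + dx
        if not (0 <= ty < height and 0 <= tx < width):
            continue  # destination falls outside the frame
        cost = abs(prev[y][x] - cur[ty][tx])
        if best_cost is None or cost < best_cost:
            best, best_cost = (dx, dy), cost
    return best


# Hypothetical evaluation-value table: (dx, dy) -> accumulated count.
table = {(0, 0): 40, (2, 0): 12, (1, 1): 3, (-1, 0): 1}
candidates = select_candidates(table, 2)
print(candidates)  # [(0, 0), (2, 0)]

# Hypothetical frames: a bright pixel moves two pixels to the right.
prev = [[10, 10, 10, 10],
        [10, 90, 10, 10]]
cur = [[10, 10, 10, 10],
       [10, 10, 10, 90]]
print(assign_vector(prev, cur, 1, 1, candidates))  # (2, 0)
print(assign_vector(prev, cur, 0, 0, candidates))  # (0, 0)
```

Only the selected candidates are evaluated for each pixel, rather than every displacement in the search area.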
An advantage of the motion-vector detecting process by the candidate-vector method is that the amount of calculation can be reduced by extracting a limited number of candidate vectors on the basis of an evaluation-value table. Another advantage is that it is possible to determine an optimal motion vector for each pixel from candidate vectors selected in advance even in a region of a boundary of objects, where incorrect detection of a motion vector is likely to occur. Hitherto, a motion vector for each pixel has been determined by a complete search, i.e., by calculating, for example, a pixel difference between frames as an evaluation value and calculating evaluation values for all the pixels in the frame. In the candidate-vector method, the motion vector for each pixel is chosen only from the candidate vectors selected in advance. Thus, compared with the complete searching, the possibility of occurrence of the same evaluation value is lower, so that incorrect detection is prevented.
The processing for generating an evaluation-value table, however, can accumulate incorrect information. In this processing, representative points of individual blocks are set, correlation between each of the representative points and each pixel in a search area set in another frame is checked to calculate evaluation values based on correlation information, and the evaluation values are accumulated.
For example, when the absolute value of the difference between a representative-point pixel X and an input pixel Y included in a search area is less than a certain threshold TH, the evaluation value is set as an accumulated evaluation value. That is, when the following expression is satisfied:
|X−Y|<TH
+1 is counted at the relevant position of the evaluation-value table, and results of calculation of all the representative points in the screen are summed into the evaluation-value table, whereby an evaluation-value table is generated.
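The accumulation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_evaluation_table`, the choice of the block center as the representative point, and the synthetic frames are hypothetical and are not taken from Patent Document 1.

```python
from collections import Counter


def build_evaluation_table(prev, cur, block, search, th):
    """Count +1 at displacement (dx, dy) whenever |X - Y| < th, where X is
    a representative-point pixel of the previous frame and Y is a pixel in
    the search area of the current frame."""
    table = Counter()
    height, width = len(prev), len(prev[0])
    # One representative point per block; the block center is an assumption.
    for by in range(block // 2, height, block):
        for bx in range(block // 2, width, block):
            x = prev[by][bx]
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    py, px = by + dy, bx + dx
                    if 0 <= py < height and 0 <= px < width:
                        if abs(x - cur[py][px]) < th:
                            table[(dx, dy)] += 1
    return table


# Hypothetical 8x8 frames in which every pixel level is unique, and the
# current frame is the previous frame shifted one pixel to the right.
prev = [[10 * x + 100 * y for x in range(8)] for y in range(8)]
cur = [[prev[y][x - 1] if x >= 1 else -1000 for x in range(8)]
       for y in range(8)]

table = build_evaluation_table(prev, cur, block=4, search=2, th=5)
print(dict(table))  # {(1, 0): 4}
```

Because every pixel level in this synthetic frame is unique, all four representative points vote only for the true displacement (1, 0). In natural images many pixels share similar levels, so votes also accumulate at other positions of the table.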
In generating the evaluation-value table, correlation is checked only on the basis of the luminance level of the representative point and the luminance levels of input pixels in the search area. Thus, when an evaluation-value table for motion-vector detection is generated using the current frame 31 and the previous frame 30 shown in FIG. 2, a highly correlated pixel corresponding to a representative point 38 in the previous frame 30, i.e., a pixel having substantially the same luminance level, is searched for within the search area 32 set in the current frame 31, and the pixel is accumulated and counted in the evaluation-value table.
In a graph shown on the right side of FIG. 2, pixel levels on one line in an X direction, passing through the representative point 38 in the previous frame 30, and pixel levels on one line in the X direction in the search area 32 of the current frame are shown.
When the search area 32 is searched for pixels having high correlation, i.e., pixels having similar pixel levels, with the pixel level=100 of the representative point 38 of the previous frame, three pixels 35, 36, and 37 are detected. These three pixels all satisfy the condition:
|X−Y|<TH
so that the pixels are set as accumulated points in the evaluation-value table. Actually, however, only the pixel 36 is associated with a correct motion vector among the three pixels 35, 36, and 37, so that the other two pixels 35 and 37 are incorrectly added as accumulated points in the evaluation-value table.
As described above, in the method of generating an evaluation-value table that has hitherto been used, accumulation based on incorrect information could occur, so that it cannot be concluded that the candidate vectors represented by peaks in an evaluation-value table are all correct. Problems in the evaluation-value-table generating process that has hitherto been used can be summarized as follows:
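The situation can be reproduced with a one-line search in a minimal sketch. Only the representative-point level of 100 comes from the description; the pixel levels on the search line, the threshold value, and the position of the representative point are hypothetical values chosen for illustration, not values read from FIG. 2.

```python
def accumulated_points(rep_level, rep_x, search_line, th):
    """Return (displacement, level) for every pixel on the search line
    that satisfies |X - Y| < th against the representative-point level."""
    return [(x - rep_x, level)
            for x, level in enumerate(search_line)
            if abs(rep_level - level) < th]


TH = 5                    # hypothetical threshold
rep_level = 100           # level of the representative point
rep_x = 2                 # hypothetical position on the line
search_line = [40, 98, 60, 101, 70, 103, 30]  # hypothetical levels

points = accumulated_points(rep_level, rep_x, search_line, TH)
print(points)  # [(-1, 98), (1, 101), (3, 103)]
```

Three displacements each receive +1, although at most one of them can correspond to the actual motion; a check based on a single luminance level cannot tell them apart.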
(a) In the method of counting +1 only on the basis of correlation with representative points detected, the frequency of the evaluation-value table depends on the area of an object in an image. Thus, it is difficult to detect motion vectors of a plurality of objects existing in the screen from the evaluation-value table.
(b) Since the magnitude of a peak in the evaluation-value table depends on the area of an object, the peak corresponding to a candidate vector of an object having a small area but apparent in the image, such as a telop, is small, so that it is difficult to read the candidate vector.
Furthermore, when a motion vector associated with each pixel is finally determined on the basis of candidate vectors, block matching is performed. In block matching, pixels neighboring a subject pixel in a previous frame are set as a block, and correlation of a plurality of pixels included in the block is detected as a whole. In order to determine motion vectors correctly through block matching, it is necessary to increase the block size so that correlation is checked correctly. When the block size is increased, the amount of calculation of evaluation values for calculating correlation, such as calculation of the sum of absolute values of differences, increases. Thus, the efficiency is reduced, and the size of a memory for holding pixel values must be increased, causing the problem of increased hardware scale.
[Patent Document 1] Japanese Unexamined Patent Application Publication No. 2001-61152