1. Field of the Invention
The present invention relates to an image pickup apparatus capable of detecting a motion vector, a method of controlling the same, and a storage medium, and more particularly to an image pickup apparatus equipped with a corresponding-point search technique for searching for corresponding points or motion vectors between two or more images, a method of controlling the same, and a storage medium.
2. Description of the Related Art
In corresponding point search and motion vector search between moving image frames, template matching (TM) is regarded as an effective method. Further, image pickup apparatuses in recent years are required to offer improved vector search performance (searchability ratio, outlier ratio, and accuracy) as pixel counts increase and overall performance improves. The searchability ratio is the proportion of searches in which only one peak is found on the correlation value map. The outlier ratio is the proportion of motion vectors that TM outputs as correct values but that contain a large error. The accuracy is an index of deviation from the true value, for deviations smaller than those of an outlier.
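As an illustration of the terms above, the following is a minimal sketch of template matching, assuming the sum of absolute differences (SAD) as the correlation measure, so that a "peak" on the correlation value map corresponds to a minimum SAD. The function names, the random test texture, and the template and search sizes are hypothetical and are not taken from any cited publication.

```python
import random


def crop(scene, top, left, height, width):
    """Cut a height x width window out of a 2-D list of pixel values."""
    return [row[left:left + width] for row in scene[top:top + height]]


def sad_map(ref, tgt, top, left, size, radius):
    """Return {(dy, dx): SAD} for a size x size template taken from ref at (top, left)."""
    h, w = len(tgt), len(tgt[0])
    scores = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = top + dy, left + dx
            if y0 < 0 or x0 < 0 or y0 + size > h or x0 + size > w:
                continue  # candidate window falls outside the target image
            scores[(dy, dx)] = sum(
                abs(ref[top + i][left + j] - tgt[y0 + i][x0 + j])
                for i in range(size)
                for j in range(size)
            )
    return scores


def best_vector(scores):
    """Return the minimum-SAD vector and whether that minimum is the only one."""
    best = min(scores.values())
    winners = [vec for vec, score in scores.items() if score == best]
    return winners[0], len(winners) == 1


# Hypothetical test data: one random texture seen through two shifted windows,
# so that content at ref (y, x) appears at tgt (y + 1, x + 2).
rng = random.Random(0)
scene = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
ref = crop(scene, 4, 4, 8, 8)
tgt = crop(scene, 3, 2, 8, 8)

scores = sad_map(ref, tgt, top=2, left=2, size=4, radius=2)
vector, is_unique = best_vector(scores)
print(vector, is_unique)
```

In this sketch, a search counts as "searchable" when `is_unique` is true, and the returned vector would be an outlier if it were unique but far from the true displacement.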
In general TM, the search range, the template size, various determination threshold values, and the ratio of size reduction of an input image are the basic design values that influence performance. In the design, the search range is determined first, according to the magnitude of the motion of an object between images. In recent years, there has been proposed a method of reducing the search range to thereby not only reduce the amount of calculation for the search, but also improve the search performance (see e.g. Japanese Patent Laid-Open Publication No. 2012-160886).
However, the above-mentioned approach is effective only in limited cases, such as a case where the motion can be predicted, and a case where an auxiliary sensing unit, such as a posture sensor, can be used. In most cases, this approach cannot necessarily be adopted.
The search range must be increased to cope with a large shake. However, increasing the search range makes it more likely that a plurality of positions each having a peak of the correlation value will appear. To reduce the number of peaks, it is necessary to increase the size of the template so that peak detection is less susceptible to the influence of a repeated pattern, a flat portion, etc.
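The trade-off described above can be reproduced with a small one-dimensional sketch. It assumes SAD matching, zero true motion, and a hypothetical signal made of a period-4 repeated pattern with a single distinctive pixel. The signal and all sizes are illustrative assumptions only.

```python
def minima(ref, tgt, start, size, radius):
    """Return every displacement dx whose SAD equals the minimum SAD (1-D search)."""
    scores = {}
    for dx in range(-radius, radius + 1):
        x0 = start + dx
        if x0 < 0 or x0 + size > len(tgt):
            continue  # candidate window falls outside the signal
        scores[dx] = sum(abs(ref[start + i] - tgt[x0 + i]) for i in range(size))
    best = min(scores.values())
    return sorted(dx for dx, score in scores.items() if score == best)


# Hypothetical signal: a repeated pattern of period 4 with one distinctive pixel.
row = [[0, 9, 3, 6][i % 4] for i in range(24)]
row[14] = 50

narrow = minima(row, row, start=8, size=4, radius=1)  # small search range
wide = minima(row, row, start=8, size=4, radius=4)    # larger search range
large = minima(row, row, start=8, size=8, radius=4)   # larger template

print(narrow, wide, large)
```

With the small template, widening the search range from 1 to 4 pixels admits a second, false minimum one pattern period away. The larger template spans the distinctive pixel and restores a single minimum, at the price of more calculation per candidate.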
Further, the various determination threshold values include threshold values that determine the searchability ratio and the outlier ratio, but these threshold values depend on the object image, and hence it is difficult to set them freely at design time with a view to improving the performance.
On the other hand, by reducing the size of the image so that the template becomes larger relative to the image, and then performing the motion vector search, it is possible to reduce the search load. However, as the ratio of size reduction is made larger (that is, as the image after size reduction becomes smaller), detailed information included in the original image is lost, and hence the search accuracy is lowered. For this reason, in a case where the original input image includes sufficiently detailed information, higher accuracy can be obtained by performing the motion vector search using an input image formed with a lower ratio of size reduction.
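Both effects of size reduction mentioned above can be seen in a short sketch. It assumes reduction by k x k box averaging, which is only one of several possible reduction filters, and uses a hypothetical 8 x 8 checkerboard as an image whose detail is one pixel wide.

```python
def reduce_by(img, k):
    """Shrink img by an integer factor k using k x k box averaging."""
    h, w = len(img) // k, len(img[0]) // k
    return [
        [
            sum(img[y * k + i][x * k + j] for i in range(k) for j in range(k)) / (k * k)
            for x in range(w)
        ]
        for y in range(h)
    ]


def coverage(template_size, img):
    """Fraction of the image area covered by a square template of the given size."""
    return template_size * template_size / (len(img) * len(img[0]))


# Hypothetical image whose only content is one-pixel-wide detail.
checker = [[100 * ((x + y) % 2) for x in range(8)] for y in range(8)]
small = reduce_by(checker, 2)

print(small[0])                                   # every pixel is the same average
print(coverage(4, checker), coverage(4, small))   # same template, larger share of the image
```

After reduction every pixel has the same value, so the fine detail that a search could have locked onto is gone, while the same 4 x 4 template now covers the whole reduced image instead of a quarter of the original.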
However, performing the motion vector search using a large template and, further, an input image formed by a low ratio of size reduction not only increases the amount of calculation, but also brings about negative effects, such as an increased memory occupancy ratio, pressure on the bandwidth of the transfer bus, and increased power consumption. Therefore, a search using an input image formed by a high ratio of size reduction is desirable where possible. In this case, the image pickup apparatus is still required to cope with camera shake and with photographing performed while walking, using e.g. an anti-shake function. When the increasingly varied cases in which applications use a motion vector are considered as a whole, few problems arise even in a motion vector search that uses an input image reduced by a high ratio of size reduction. This is because, although a lower ratio of size reduction generally gives higher accuracy, the object as a search target changes in appearance due to the influences of a parallax generated by parallel movement of the camera or the object, a motion of a non-rigid body such as a human body, rolling shutter distortion, and distortion aberration of the optical system, so that the detailed information included in the input image changes between images. Therefore, even when the search is performed using an input image formed by a lower ratio of size reduction together with a large template, sufficient accuracy is not necessarily obtained, which reduces the merit of such a search. In particular, parallax conflict, caused by a plurality of parallaxes arising within the template, greatly degrades the search performance. However, when each of the cases in which applications use a motion vector is considered individually, there is room for improving performance.
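The calculation-load argument at the start of the preceding paragraph can be made concrete with a rough count. The sketch assumes an exhaustive integer-pixel search that evaluates one absolute difference per template pixel per candidate position. The search radii and template sizes are hypothetical examples, not design values from the source.

```python
def sad_operations(radius, template_size):
    """Absolute-difference evaluations for one exhaustive search of one template."""
    candidates = (2 * radius + 1) ** 2
    return candidates * template_size ** 2


# Hypothetical settings: the same scene searched at full size and at half size,
# with the search radius and the template scaled along with the image.
full = sad_operations(radius=32, template_size=64)
half = sad_operations(radius=16, template_size=32)

print(full, half, round(full / half, 1))
```

Under these assumptions, halving the image in each direction reduces the work per vector by a factor of about 15.5, which is why a high ratio of size reduction is attractive from the standpoint of calculation, memory, and bus traffic.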
A parallax has only a small influence on the search performance in the cases targeted by the conventional anti-shake technique, namely where a camera shake having a small shake angle occurs and where an object at a long distance is photographed. Therefore, in these cases it is possible to improve the accuracy by using an input image with a lower ratio of size reduction, such as an unreduced image.
Further, as a countermeasure against the problem caused by parallax conflict, there has been proposed a hierarchical layer search process that performs the search using a relatively large template first, and makes the template as small as possible for the final search (see e.g. Japanese Patent Laid-Open Publication No. 2011-164905).
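The general idea of such a hierarchical search can be sketched as a two-layer, coarse-to-fine SAD search. This is a simplified illustration and not the process of the cited publication: the number of layers, the box-average reduction, the template sizes, the search radii, and the test texture are all assumptions made for the example.

```python
import random


def crop(scene, top, left, height, width):
    """Cut a height x width window out of a 2-D list of pixel values."""
    return [row[left:left + width] for row in scene[top:top + height]]


def reduce_by_2(img):
    """Halve the image in each direction using 2 x 2 box averaging."""
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
            for x in range(len(img[0]) // 2)
        ]
        for y in range(len(img) // 2)
    ]


def search(ref, tgt, top, left, size, radius, center=(0, 0)):
    """Exhaustive SAD search within +/-radius of center; returns the best (dy, dx)."""
    best_score, best_vec = None, None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            y0, x0 = top + dy, left + dx
            if y0 < 0 or x0 < 0 or y0 + size > len(tgt) or x0 + size > len(tgt[0]):
                continue  # candidate window falls outside the target image
            score = sum(
                abs(ref[top + i][left + j] - tgt[y0 + i][x0 + j])
                for i in range(size)
                for j in range(size)
            )
            if best_score is None or score < best_score:
                best_score, best_vec = score, (dy, dx)
    return best_vec


def hierarchical_search(ref, tgt, top, left):
    """Two-layer search: coarse on half-size images, then a narrow full-size refinement."""
    # Coarse layer: a 4 x 4 template here spans 8 x 8 pixels of the original image.
    coarse = search(reduce_by_2(ref), reduce_by_2(tgt), top // 2, left // 2, size=4, radius=3)
    # Fine layer: small template, searched only +/-1 pixel around the scaled coarse vector.
    return search(ref, tgt, top, left, size=4, radius=1,
                  center=(2 * coarse[0], 2 * coarse[1]))


# Hypothetical test data: content at ref (y, x) appears at tgt (y + 2, x + 4).
rng = random.Random(0)
scene = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
ref = crop(scene, 8, 8, 16, 16)
tgt = crop(scene, 6, 4, 16, 16)

result = hierarchical_search(ref, tgt, top=4, left=4)
print(result)
```

The displacement in this example is even in both directions so that the coarse layer can match it exactly. Even this small sketch needs a reduced copy of both images and a second search pass, which hints at the memory and complexity costs that grow with the number of layers.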
However, this process is difficult to employ: the complexity of constructing hierarchical images and the like makes real-time processing harder, the circuit scale becomes too large, and the memory occupancy ratio becomes too high. It is therefore necessary to make a compromise depending on restrictions on the system, for example, by reducing beforehand the lowest-layer image, which is lowest in ratio of size reduction and is used as the base image. If the hierarchical layer search is performed using such a reduced final base image, a single-layer vector search (single layer search) using an unreduced input image may be higher in vector search accuracy than the hierarchical layer search, depending on photographing conditions. Further, in a case where only a camera shake occurs and substantially no parallax is generated, the complicated functions of the hierarchical layer search, which assume that a parallax is generated and rely on a large number of threshold settings, do not necessarily function correctly, and the hierarchical layer search is then lower in performance than the simple single layer search even if there is no difference in the ratio of size reduction of the base image.
As described above, although the hierarchical layer search has been proposed as an effective method of improving the performance of the vector search, its advantageous effects are not always obtained across the various uses of a camera, ranging from use accompanied by camera shake to photographing while walking, and it also has negative effects, such as pressure on the bandwidth of the transfer bus and increased power consumption. Further, particularly in a specific photographing state, such as a state where a camera shake occurs, higher performance is sometimes obtained by performing the vector search using the simple single layer search.