As a technique for obtaining an image with a deep depth of field, the depth of field being the range of distances in the depth direction over which an object is in focus, there is a known technique in which a plurality of images are captured at different focus positions and, for each position in the images, the image that is in focus at that position is selected from the plurality of images and the selected images are composited. When photographing is performed while the focus position is changed, the view angle changes with the focus position, and corresponding positions in the plurality of images deviate from each other in some cases. Moreover, since the plurality of images are photographed at different times, deviation is also caused by movement of the image capturing apparatus or of the object during photography. Thus, in order to generate an image having no deviation, it is necessary to correct the above-described deviation so that the corresponding positions of the plurality of images match each other, and then to select the in-focus images.
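The selection-and-composition step described above can be illustrated with a minimal sketch. It assumes the images are already aligned, are grayscale, and are given as nested lists of equal size. The sharpness measure (an absolute 4-neighbour Laplacian) and the function names `sharpness` and `focus_stack` are assumptions made for illustration and are not taken from the known technique itself.

```python
def sharpness(img, y, x):
    """Absolute 4-neighbour Laplacian at (y, x); border pixels are clamped."""
    h, w = len(img), len(img[0])
    up = img[max(y - 1, 0)][x]
    down = img[min(y + 1, h - 1)][x]
    left = img[y][max(x - 1, 0)]
    right = img[y][min(x + 1, w - 1)]
    return abs(up + down + left + right - 4 * img[y][x])


def focus_stack(images):
    """Composite aligned images by taking, at each pixel, the value from
    the image whose local sharpness is highest (ties go to the first image)."""
    h, w = len(images[0]), len(images[0][0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(images, key=lambda im: sharpness(im, y, x))
            out[y][x] = best[y][x]
    return out


# Image a has detail on the left and is flat on the right; image b is the opposite.
a = [[0, 10, 0, 5, 5, 5]]
b = [[5, 5, 5, 0, 10, 0]]
composite = focus_stack([a, b])
```

In this toy input the composite takes its left half from `a` and its right half from `b`. If the images were not aligned first, the per-pixel comparison would compare different parts of the scene, which is why the deviation must be corrected before selection.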
As a technique for correcting deviation for composition, PTL 1 describes a technique in which one of a plurality of images having different focus positions is selected as a base image, a feature point is extracted from the base image, points corresponding to the feature point are searched for in the images other than the base image, and the images other than the base image are corrected so that their corresponding points match the feature point of the base image. Moreover, PTL 2 describes a technique in which, in order to improve the accuracy of positional alignment of a plurality of images, a correction amount is calculated by using, among extracted feature points, those points whose positions do not move between the plurality of images.
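As an illustration of calculating a correction amount from corresponding points, the following is a minimal sketch that fits a uniform scale and a translation by least squares. The motion model (scale for the view-angle change, shift for apparatus movement, no rotation) and the names `estimate_scale_shift` and `apply_correction` are assumptions made for this example; PTL 1 and PTL 2 are not stated here to use this particular model, and feature extraction and corresponding-point search are outside the sketch.

```python
def estimate_scale_shift(src_pts, dst_pts):
    """Least-squares fit of dst = s * src + (tx, ty).

    src_pts: corresponding points found in a non-base image.
    dst_pts: feature points of the base image, in the same order.
    """
    n = len(src_pts)
    sx = sum(p[0] for p in src_pts) / n
    sy = sum(p[1] for p in src_pts) / n
    dx = sum(q[0] for q in dst_pts) / n
    dy = sum(q[1] for q in dst_pts) / n
    # Correlate the centred point sets to obtain the scale.
    num = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy)
              for p, q in zip(src_pts, dst_pts))
    den = sum((p[0] - sx) ** 2 + (p[1] - sy) ** 2 for p in src_pts)
    if den == 0:
        raise ValueError("source points must not all coincide")
    s = num / den
    return s, dx - s * sx, dy - s * sy


def apply_correction(pt, s, tx, ty):
    """Map a point of the non-base image into base-image coordinates."""
    return (s * pt[0] + tx, s * pt[1] + ty)


# Hypothetical correspondences: the base image is scaled by 2 and shifted by (3, -1).
src = [(0, 0), (10, 0), (0, 10), (10, 10)]
dst = [(3, -1), (23, -1), (3, 19), (23, 19)]
s, tx, ty = estimate_scale_shift(src, dst)
```

A fit of this kind is only as good as the correspondences fed into it. A pair that includes a feature point on a moving object biases the estimated scale and shift, which is the accuracy problem that restricting the calculation to non-moving points is meant to address.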