In auto-focus (AF) control of a video camera or the like, the so-called TV-AF system is the most common approach: an AF evaluation value indicative of the sharpness (contrast state) of the video signals generated by an image sensor is generated, and a search is made for the position of a focus lens at which the AF evaluation value is at its maximum. However, when photographing a person, for example, there has been a problem in that the camera fails to focus on the person as the main object and focuses instead on the background when the background includes objects of high spatial frequency.
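The TV-AF search described above can be illustrated with a minimal sketch. The contrast measure (a sum of squared horizontal luminance differences), the `(top, left, height, width)` area layout, the function names, and the toy `simulated_capture` scene are all assumptions made for illustration; an actual camera typically derives the AF evaluation value from a band-pass filtered video signal and moves the lens incrementally rather than scanning every position.

```python
from typing import Callable, Sequence, Tuple

Image = Sequence[Sequence[float]]  # rows of luminance values
Area = Tuple[int, int, int, int]   # (top, left, height, width), hypothetical layout


def af_evaluation_value(image: Image, area: Area) -> float:
    """Sum of squared horizontal luminance differences inside the area.

    This is one simple contrast measure chosen for the sketch; it stands in
    for whatever sharpness signal a real TV-AF system extracts.
    """
    top, left, height, width = area
    total = 0.0
    for row in image[top:top + height]:
        window = row[left:left + width]
        for a, b in zip(window, window[1:]):
            total += (b - a) ** 2
    return total


def search_focus_position(
    capture: Callable[[int], Image],
    positions: Sequence[int],
    area: Area,
) -> int:
    """Return the lens position whose frame gives the largest AF evaluation value."""
    return max(positions, key=lambda p: af_evaluation_value(capture(p), area))


def simulated_capture(position: int, in_focus: int = 5) -> Image:
    """Toy scene (assumption): a vertical edge that widens as the lens leaves in_focus."""
    blur = abs(position - in_focus) + 1  # edge transition width in pixels
    row = [min(1.0, max(0.0, (x - 7.5) / blur + 0.5)) for x in range(16)]
    return [row] * 8


whole_frame: Area = (0, 0, 8, 16)
best = search_focus_position(simulated_capture, range(11), whole_frame)
print(best)
```

In this toy scene the evaluation value falls off on both sides of the in-focus position, so an exhaustive search over the lens positions is enough to find the peak.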
To solve this problem, image sensing apparatuses with a face recognition function are known. For example, there have been proposed an image sensing apparatus (for example, Japanese Patent Laid-Open No. 2006-227080) in which a focus detection area for detecting a focus state is set in an area including a recognized face area and focus detection is carried out in the set focus detection area, and an image sensing apparatus (for example, Japanese Patent Laid-Open No. 2001-215403) in which an eye of a person is detected and focus detection is carried out for the eye.
However, in focus detection using the face recognition function described above, the face may not be recognized stably, for example because the features of the face vary when the person looks away or closes his/her eye(s), or because of camera shake. The stability of focusing may therefore be reduced. The stability of focusing may also be reduced when the size of the object image varies or when the object image is small. The effect is found to be particularly significant for moving images, since it is highly likely that the person is constantly moving. If the face of a person is always recognized, focusing on the person is more stable when the face area of the person is specified as the focus detection area. However, when the face is recognized only intermittently, the focus detection area changes depending on whether or not the face is recognized, the AF evaluation value varies accordingly, and stable focusing cannot be carried out. Further, when the area from which the AF evaluation value is taken changes with the size of the object image, or when that area is too small because the object image is small, the AF evaluation value signal cannot be obtained stably, and focusing may thus be difficult.
Further, when the main object is not a person, similar problems also occur if a focus detection area is set based on the results of detecting an object in order to carry out focus detection.
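The instability described above can be made concrete with a small sketch in which the focus detection area follows the detector output frame by frame. Everything here is an illustrative assumption: the `contrast` measure, the `(top, left, height, width)` area layout, the `DEFAULT_AREA` fallback, and the synthetic scene. The scene and the lens position are held fixed, so any change in the evaluation value comes solely from the change of area.

```python
from typing import List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]
Area = Tuple[int, int, int, int]  # (top, left, height, width), hypothetical layout

DEFAULT_AREA: Area = (0, 0, 8, 16)  # assumed fallback area covering the whole frame


def contrast(image: Image, area: Area) -> float:
    """Sum of squared horizontal differences inside the area (a simple stand-in measure)."""
    top, left, height, width = area
    return sum(
        (b - a) ** 2
        for row in image[top:top + height]
        for a, b in zip(row[left:left + width], row[left + 1:left + width])
    )


def evaluation_trace(image: Image, detections: Sequence[Optional[Area]]) -> List[float]:
    """AF evaluation value per frame when the area follows the detector output.

    A None entry models a frame in which the object was not recognized, so
    the fallback area is used for that frame.
    """
    return [
        contrast(image, face if face is not None else DEFAULT_AREA)
        for face in detections
    ]


# Static scene: low-contrast "face" on the left, high-contrast background on the right.
scene = [[0.0, 0.2] * 4 + [0.0, 1.0] * 4 for _ in range(8)]
face_area: Area = (2, 0, 4, 8)

# Detection succeeds and fails on alternate frames.
trace = evaluation_trace(scene, [face_area, None, face_area, None])
print(trace)
```

Although nothing in the scene moves, the trace alternates between a small value (about 1.1, face area) and a much larger one (about 58.6, fallback area dominated by the background), which is the kind of variation that prevents a peak search from settling.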