1. Field of the Invention
The present invention relates to image shooting devices such as digital still cameras and digital video cameras.
2. Description of Related Art
In digital still cameras and digital video cameras, processing for detecting a human face from a shot image has been put into practical use, and methods have been proposed for executing camera control and various application programs with the detected face taken as the face of interest. More than one face may be detected from a shot image, in which case one of those faces is chosen as the face of interest, and camera control and various application programs are executed with respect to the face thus chosen. Such a face of interest is called a priority face.
For example, in a conventional digital camera furnished with a face detection function, when more than one face is detected from one shot image (still image), a priority face is chosen based on the distances of the faces on the shot image from its center and the sizes of the faces, and automatic focusing control is executed so that focus comes on the priority face.
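By way of illustration only, the choice of a priority face from the distance to the image center and the face size may be sketched as follows. The conventional camera's actual formula is not given here; the `FaceRegion` type, the `choose_priority_face` function, the `size_weight` parameter, and the scoring rule (normalized size minus normalized distance from the center) are all assumptions made for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class FaceRegion:
    cx: float    # x coordinate of the face region center, in pixels
    cy: float    # y coordinate of the face region center, in pixels
    size: float  # side length of the face region, in pixels


def choose_priority_face(faces, image_w, image_h, size_weight=1.0):
    """Return the index of the best-scoring face, or None if there are no faces.

    Assumed score (not taken from any actual camera): larger faces score
    higher, and faces farther from the image center score lower.
    """
    if not faces:
        return None
    half_diag = math.hypot(image_w, image_h) / 2.0

    def score(face):
        # Distance from the image center, normalized by half the diagonal.
        dist = math.hypot(face.cx - image_w / 2.0, face.cy - image_h / 2.0)
        # Face size, normalized by the shorter image side.
        rel_size = face.size / min(image_w, image_h)
        return size_weight * rel_size - dist / half_diag

    return max(range(len(faces)), key=lambda i: score(faces[i]))
```

With two faces of equal size, the one nearer the center is chosen; raising `size_weight` lets a larger off-center face win instead.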
In the case of a moving image, once a face is chosen as a priority face at a given time, as long as the face continues to be detected, that face needs to be kept recognized as the priority face. Accordingly, in a digital camera that performs camera control etc. with a priority face on a moving image taken as of interest, the priority face is followed up through face identification processing. Such identification processing is generally achieved based on the position information of a face on an image. Identification processing based on position information exploits the following principle: change in the position of a face on an image is continuous in the temporal direction.
Now, with reference to FIGS. 19A and 19B, identification processing based on position information will be described briefly. Consider a case where shooting at a first time point yielded an image 901 as shown in FIG. 19A and then shooting at a second time point yielded an image 902 as shown in FIG. 19B. The length of time between the first and second time points is, for example, equal to the frame period in moving image shooting. Through face detection processing on the images 901 and 902, face regions 911 and 921 are extracted from the image 901, and face regions 912 and 922 are extracted from the image 902. In this case, through identification processing based on position information, the positions of the face regions are compared between the images 901 and 902, and such face regions whose positions are close to each other are judged to contain an identical face, thereby achieving face identification. In the example shown in FIGS. 19A and 19B, the faces contained in the face regions 911 and 912 are judged to be identical, and the faces contained in the face regions 921 and 922 are judged to be identical.
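The position-based identification described above may be sketched as follows. This is a minimal illustration, not the processing of any particular camera: the `match_by_position` function, the `max_dist` threshold, the greedy nearest-pair strategy, and the coordinates in the usage example are assumptions. Face regions are keyed by their reference signs and represented only by their center positions.

```python
import math


def match_by_position(prev_regions, curr_regions, max_dist):
    """Pair face regions of two consecutive images by positional closeness.

    prev_regions, curr_regions: dict mapping a region id to an (x, y) center.
    Pairs are accepted in order of increasing distance, each region is used
    at most once, and pairs farther apart than max_dist are rejected.
    Returns a dict mapping previous-image ids to current-image ids.
    """
    candidates = sorted(
        (math.dist(p, c), pid, cid)
        for pid, p in prev_regions.items()
        for cid, c in curr_regions.items()
    )
    matches, used_curr = {}, set()
    for d, pid, cid in candidates:
        if d > max_dist:
            break  # all remaining candidate pairs are even farther apart
        if pid in matches or cid in used_curr:
            continue
        matches[pid] = cid
        used_curr.add(cid)
    return matches


# Hypothetical center positions for the face regions of FIGS. 19A and 19B.
image_901 = {911: (100, 120), 921: (400, 130)}
image_902 = {912: (110, 125), 922: (390, 128)}
matches = match_by_position(image_901, image_902, max_dist=50)
```

Here `matches` pairs region 911 with 912 and region 921 with 922, corresponding to the judgment described for FIGS. 19A and 19B.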
A technology is also known that exploits face recognition technology to follow up a particular face. Inconveniently, however, face recognition technology requires complicated processing for extracting an individual's features from an image of his face to achieve the following-up of a particular face. By contrast, the above-described identification processing based solely on positional information allows the following-up of a particular face through simple processing, and is therefore helpful.
Indeed, a priority face can be followed up in the manner described above; however, for some reason, the face of interest to be grasped as the priority face may temporarily become undetectable. For example, when the face of interest happens to be located behind another subject, and is thus hidden from the camera's view (that is, when the face of interest is shielded), it cannot be detected for as long as it remains shielded. Conventional identification processing that can be executed in such a case will now be described with reference to FIGS. 20A, 20B, and 20C.
FIGS. 20A, 20B, and 20C show images 950, 960, and 970 obtained by shooting at first, second, and third time points respectively. It is assumed that the first, second, and third time points occur one after another in this order. At the first, second, and third time points, two faces FC1 and FC2 are present inside the shooting region of a digital camera. Suppose that, with respect to the faces FC1 and FC2, face regions 951 and 952, respectively, are extracted from the image 950, and that the face FC1 is chosen as the priority face at the first time point. In FIGS. 20A to 20C (and also in FIGS. 21A to 21C described later), faces enclosed in thick solid-lined frames are the priority face.
At the second time point, suppose that, as a result of movement of the subject, the face FC1 is hidden behind the face FC2, and thus the face FC1 is not detected from the image 960 at the second time point. The reference sign 962 represents the face region of the face FC2 extracted from the image 960. The position of the face region 962 is close to the position of the face region of the priority face (e.g., the position of the face region 951) on the shot image obtained immediately before the image 960. In addition, only one face region is detected from the image 960. Consequently, at the second time point, instead of the face FC1, the face FC2 corresponding to the face region 962 is recognized as the priority face.
As a result of movement of the subject thereafter, the faces FC1 and FC2 are detected from the image 970 at the third time point, and the face regions 971 and 972 of the faces FC1 and FC2 are extracted from the image 970. What should be noted here is that, although the face that the user takes as of interest is the face FC1, since the priority face shifted from the face FC1 to the face FC2 at the second time point, through identification processing between the second and third time points, the priority face in the image 970 remains the face FC2.
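The shift of the priority face described with reference to FIGS. 20A to 20C can be reproduced with a small simulation. The sketch below is illustrative only: the `follow_priority` function, the `max_dist` threshold, and all coordinates are assumptions. The labels FC1 and FC2 serve only to make the outcome readable; the follow-up logic itself uses positions alone.

```python
import math


def follow_priority(priority_pos, detections, max_dist):
    """Return (label, position) of the detected face closest to priority_pos,
    or None if no detected face lies within max_dist of it."""
    best = None
    for label, pos in detections.items():
        d = math.dist(priority_pos, pos)
        if d <= max_dist and (best is None or d < best[0]):
            best = (d, label, pos)
    return None if best is None else (best[1], best[2])


# Hypothetical face positions; FC1 is hidden behind FC2 at the second time point.
frames = [
    {"FC1": (200, 200), "FC2": (260, 200)},  # image 950
    {"FC2": (230, 200)},                     # image 960: FC1 is not detected
    {"FC1": (180, 200), "FC2": (240, 200)},  # image 970
]

priority_label, priority_pos = "FC1", frames[0]["FC1"]
history = [priority_label]
for detections in frames[1:]:
    result = follow_priority(priority_pos, detections, max_dist=80)
    if result is not None:
        priority_label, priority_pos = result
    history.append(priority_label)
```

Under these assumed positions, `history` ends as FC1, FC2, FC2: once the face FC2 has taken over at the second time point, its position in the image 970 is the closest to the stored priority position, so the face FC1 is not recovered.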
In the example corresponding to FIGS. 20A to 20C, the hiding of the face FC1 behind the face FC2 makes the face FC1, which is to be taken as the priority face, temporarily undetectable. A similar phenomenon occurs also in a situation where the face FC1 temporarily goes out of the shooting region (field of view) of the camera. A conventional method for choosing a priority face which can cope with such a situation will now be described with reference to FIGS. 21A to 21C.
FIGS. 21A, 21B, and 21C show images 950a, 960a, and 970a obtained by shooting at first, second, and third time points respectively. Suppose that, with respect to faces FC1 and FC2, face regions 951a and 952a, respectively, are extracted from the image 950a, and that the face FC1 is chosen as the priority face at the first time point. As a result of, for example, movement of the face FC1 between the first and second time points, if all or part of the face FC1 goes out of the shooting region of the camera so much that the face FC1 cannot be detected, the face FC1 is not detected from the image 960a. At this time, if the face FC2 is detected from the image 960a and the face region 962a of the face FC2 is extracted, the camera chooses the face FC2 anew as the priority face instead of the face FC1. As a result of, for example, movement of the face FC1 or panning of the camera thereafter, even if the faces FC1 and FC2 are detected from the image 970a at the third time point and their face regions 971a and 972a are extracted, through identification processing between the second and third time points, the priority face in the image 970a remains the face FC2.
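The re-choice of the priority face described with reference to FIGS. 21A to 21C may likewise be sketched. This is an assumed illustration: the `update_priority` function, the `max_dist` threshold, and the coordinates are hypothetical, and the re-choice step simply takes the first detected face as a placeholder for whatever selection rule the camera applies.

```python
import math


def update_priority(priority_pos, detections, max_dist):
    """Follow the priority face by position; if it is lost, choose anew.

    detections: dict mapping a face label to its (x, y) position.
    Returns the label of the priority face, or None if no face is detected.
    """
    near = [
        (math.dist(priority_pos, pos), label)
        for label, pos in detections.items()
        if math.dist(priority_pos, pos) <= max_dist
    ]
    if near:
        return min(near)[1]  # closest face within the threshold
    # Priority face lost: placeholder re-choice (first detected face, if any).
    return next(iter(detections), None)


# Hypothetical face positions; FC1 leaves the shooting region at the second time point.
frames = [
    {"FC1": (100, 240), "FC2": (500, 240)},  # image 950a
    {"FC2": (500, 240)},                     # image 960a: FC1 is not detected
    {"FC1": (120, 240), "FC2": (500, 240)},  # image 970a
]

priority = "FC1"
priority_pos = frames[0][priority]
chosen = [priority]
for detections in frames[1:]:
    priority = update_priority(priority_pos, detections, max_dist=80)
    if priority is not None:
        priority_pos = detections[priority]
    chosen.append(priority)
```

With these assumed positions, `chosen` ends as FC1, FC2, FC2: the face FC2 is chosen anew in the image 960a and is then followed into the image 970a, even though the face FC1 has returned close to where it was.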
In this way, when a face of interest to be grasped as a priority face becomes temporarily undetectable, executing identification processing under the same conditions as when the face of interest is detected may lead to another face, one different from the face of interest, being chosen as the priority face. This makes it impossible to continue camera control or the like with respect to the face of interest.