1. Field of the Invention
The present invention relates to an image processing apparatus, image processing method, and program for detecting a predetermined subject in an image and identifying its attribute.
2. Description of the Related Art
A face detection technique detects a person's face in an image and specifies its position and size (e.g., see P. Viola and M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Vol. 1, pp. 511-518, December 2001). Such techniques are widely applied to image capturing apparatuses such as digital cameras, printers for photo printing, and the like. They are used, for example, to determine the subject that is given priority in AE (Automatic Exposure)/AF (Auto Focus) control or in skin-smoothing processing.
Face image-based person recognition techniques determine whose face has been detected (e.g., see A. B. Ashraf, S. Lucey, and T. Chen, “Learning Patch Correspondences for Improved Viewpoint Invariant Face Recognition”, Carnegie Mellon University, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2008). As their precision gradually improves, these techniques are being incorporated into products such as digital cameras. Further, techniques for identifying an expression, such as whether a person is smiling or has his or her eyes open, and a face attribute, such as the face direction, are under development and are used to determine a photo opportunity and the like.
In addition to these object detection/recognition functions regarding a human face, demand has arisen for a function (user's registered subject detection function) that lets the user designate an arbitrary subject of his choice as a target object. For example, Panasonic digital camera Lumix DMC FX-550™ meets this demand with a moving subject tracking function in addition to the face detection and person recognition functions (see Panasonic digital camera DMC FX-550™ manuals). The moving subject tracking function tracks an object designated on the touch panel, and sets it as an AE/AF target. Another model of Panasonic digital camera allows the user to designate a subject of his choice and track it, without using the touch panel, by capturing it within a predetermined frame and pressing the shutter button halfway. Sharp cell phone SH-06A™ with a camera function also has a similar object tracking function. As a method of easily designating an object the user wants on the touch panel, there is, for example, a technique disclosed in Japanese Patent Laid-Open No. 2006-101186.
In one known method of registering a subject the user wants and detecting it in an input image, an image of a predetermined size containing the subject is registered as a template. A corresponding position within each subsequent captured frame is then quickly detected using the method described in B. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision”, Proceedings of Imaging Understanding Workshop, pp. 121-130. Alternatively, a feature amount called HoG, which is described in N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection”, CVPR, 2005, can be extracted and set as registered data. In this case, in detection processing, an input image is scanned using a subwindow equal in size to the registered object image, and the same type of HoG feature as the registered data is extracted from the image cut out for each subwindow. A match/mismatch of the extracted feature with the registered data is then determined using a discriminator such as an SVM (Support Vector Machine). A match means that the registered subject exists at the subwindow position. Also, using a method as described in P. Viola and M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Vol. 1, pp. 511-518, December 2001, a detection parameter can be learned within the device by treating a plurality of object images designated by the user as a detection target as positive data, and treating background data, which can be held in advance, as negative data.
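The subwindow scan described above may be illustrated by the following minimal sketch. It is a simplification under stated assumptions and not the method of any cited reference: the feature is a single gradient-orientation histogram per window rather than the cell- and block-normalized HoG of Dalal and Triggs, and a cosine-similarity threshold stands in for a trained discriminator such as an SVM. The function names, the window size, and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def orientation_histogram(img, top, left, size, bins=8):
    """Simplified HoG-like feature: one L2-normalized histogram of
    unsigned gradient orientations over a size x size window."""
    hist = [0.0] * bins
    # Central differences on interior pixels only, so every pixel read
    # lies inside the window.
    for y in range(top + 1, top + size - 1):
        for x in range(left + 1, left + size - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag > 0.0:
                angle = math.atan2(gy, gx) % math.pi  # fold into [0, pi)
                index = min(int(angle / math.pi * bins), bins - 1)
                hist[index] += mag  # magnitude-weighted vote
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0.0 else hist


def detect(img, registered, size, threshold=0.9):
    """Scan img with a size x size subwindow and return ((top, left), score)
    for every position whose feature matches the registered data.
    ASSUMPTION: cosine similarity >= threshold replaces an SVM decision."""
    hits = []
    for top in range(len(img) - size + 1):
        for left in range(len(img[0]) - size + 1):
            feature = orientation_histogram(img, top, left, size)
            score = sum(a * b for a, b in zip(feature, registered))
            if score >= threshold:
                hits.append(((top, left), score))
    return hits


# Registration: the user-designated object image (hypothetical 8 x 8 pattern).
SIZE = 8
template = [[255.0 if x >= y else 0.0 for x in range(SIZE)] for y in range(SIZE)]
registered = orientation_histogram(template, 0, 0, SIZE)

# Detection: an input image containing the object at row 5, column 7.
scene = [[0.0] * 24 for _ in range(20)]
for y in range(SIZE):
    for x in range(SIZE):
        scene[5 + y][7 + x] = template[y][x]

hits = detect(scene, registered, SIZE)
print(any(pos == (5, 7) for pos, _ in hits))
```

Because a single histogram discards the spatial layout inside the window, neighboring subwindows may also exceed the threshold. A practical detector would use the full HoG descriptor and a discriminator trained on positive and negative data.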
When using the foregoing person authentication function, the user needs to register the object image (e.g., face image) of a person of the user's choice in the apparatus in advance. Likewise, when using the user's registered subject detection function, the user needs to select and register an object image he wants. That is, the user needs to register a selected object image when using either of these functions.
However, this registration operation conventionally differs depending on the function to be used. For example, when the user uses the person recognition function of the above-mentioned digital camera, he first selects “person authentication” from the shooting menu. He then selects “registration”, and photographs the face of the person to be registered in accordance with guidance displayed at the center of the LCD. Finally, he inputs a title code such as a name, completing the registration. Alternatively, when the user takes several pictures of a person's face with “auto registration” set ON, a screen automatically appears to prompt him to register the face of a frequently photographed person. After that, the user can register the person's face by the same procedure. In shooting after registration, the face image of the registered person (e.g., a face close to that of the registered person) is detected and preferentially undergoes AE/AF. When using the moving subject tracking function of this camera, the user registers an object by designating the object displayed on the LCD using the touch panel. AE/AF is then performed continuously, following the motion of the object. Note that the moving subject tracking function and the person authentication function are mutually exclusive; the user can use only one of them at a time.
For this reason, when a conventional apparatus includes a plurality of types of subject detection/recognition functions that each require a registration operation to specify a preferential object, the user needs to select and execute a registration operation that differs from function to function, which places a heavy burden on the user.