The technology disclosed in the present specification relates to an image processing apparatus and an image processing method that process an image captured by a camera, to an image capture apparatus, and to a computer program. In particular, it relates to an image processing apparatus and an image processing method that detect an object included in an image, such as a face image of a person, and to an image capture apparatus and a computer program.
In an environment in which people live, there are various products that a user controls, such as domestic appliances or information devices. A gesture operation is one example of a method by which a user remotely operates this type of device.
For example, a remote operation apparatus has been proposed that captures an image of an operator operating an operation display unit, which displays an operation button or a menu, and detects an operation by the operator based on the shape and movement of a hand region detected from the captured image and on the display of the operation display unit (for example, refer to Japanese Unexamined Patent Application Publication No. 2010-79332). According to this related art, it is possible to recognize a gesture of a user using the contours of a finger.
In addition, an image recognition apparatus has been proposed in which an operation of an operator is read three-dimensionally with respect to a virtual operation surface, and whether or not a movement is an operation is determined based on the positional relationship between a portion of the operator and the virtual operation surface. If the motion of the operator is performed in any of two or more virtual operation strata determined based on the positional relationship with the virtual operation surface, the content of the operation is determined based on the operation classification allocated in advance to that virtual operation stratum and on the motion of the operator within that stratum (for example, refer to Japanese Unexamined Patent Application Publication No. 2010-15553).
For a gesture operation, the basic approach is to analyze a gesture by recognizing a face or a hand of the user from an image of the user captured by a camera. Accordingly, it is considered possible to introduce gesture operations to various domestic appliances or information devices on which a camera is mounted.
A face recognition system, for example, consists of two processes: a face detection process that detects the position of a face image and extracts it as a detected face, and a face recognition process that recognizes the detected face (that is, identifies the person). Of these, a face detection process in which a template image of a face or the like is scanned over an input image and a detected face is extracted by pattern matching is in general use (for example, refer to Japanese Patent No. 4389956).
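As an illustration only, the template-scanning style of detection described above can be sketched as follows. The use of the sum of absolute differences (SAD) as the matching score, the function name `match_template`, and the small integer grids standing in for images are all assumptions made for this sketch; the cited related art may use a different matching measure.

```python
def match_template(image, template):
    """Slide `template` over `image` and return (row, col, sad) of the best window.

    `image` and `template` are lists of rows of integer pixel values.
    SAD (sum of absolute differences) is an assumed score; lower is better.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            sad = sum(
                abs(image[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            # Keep the first window with the strictly lowest score.
            if best is None or sad < best[2]:
                best = (r, c, sad)
    return best


# Hypothetical 2x2 "face" template and a 4x4 input containing it at (1, 2).
TEMPLATE = [[9, 1],
            [1, 9]]
IMAGE = [[0, 0, 0, 0],
         [0, 0, 9, 1],
         [0, 0, 1, 9],
         [0, 0, 0, 0]]

print(match_template(IMAGE, TEMPLATE))
```

In this sketch, a score of zero means that the window is identical to the template. A practical detector would compare the score against a threshold rather than require an exact match.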
Incidentally, if the face image in the input image is rotated about the optical axis of the camera, pattern matching against the template image in its given posture is not performed appropriately, and there is a problem in that the face detection accuracy is lowered. For example, in a case in which a camera is mounted on a device that performs gesture operations and the device main body is supported by a rotating mechanism (for example, refer to Japanese Unexamined Patent Application Publication No. 11-24577, Japanese Unexamined Patent Application Publication No. 2003-157016 and Japanese Unexamined Patent Application Publication No. 2011-17738), it is assumed that a face image captured with the internal camera rotates.
Alternatively, also in a case in which the device on which the camera is mounted is of a hand-held type, it is assumed that the posture of the camera changes according to the motion of the arm of the user, and that the captured subject image rotates.
For example, a face image detection apparatus has been proposed that performs face detection by tracking changes in a face image using a plurality of reduced rotated image data items, in which input image data is reduced and rotated to a plurality of angles (for example, refer to Japanese Unexamined Patent Application Publication No. 2008-287704). However, when pattern matching with a template image is performed on a plurality (n) of reduced rotated image data items, the processing amount increases n-fold.
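The n-fold increase in processing can be made concrete with a minimal sketch. It assumes rotation in 90-degree steps only, no image reduction, and SAD as the matching score; the function names `rotate_cw`, `scan` and `detect_over_rotations` are hypothetical and do not come from the cited publication.

```python
def rotate_cw(img):
    """Rotate a grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def scan(image, template):
    """One template scan; return (best_sad, windows_evaluated)."""
    th, tw = len(template), len(template[0])
    best, windows = None, 0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            windows += 1
            sad = sum(
                abs(image[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best is None or sad < best:
                best = sad
    return best, windows


def detect_over_rotations(image, template, n):
    """Scan n rotated copies; return (best_sad, best_rotation_index, total_windows)."""
    best_sad, best_k, total = None, None, 0
    current = image
    for k in range(n):
        sad, windows = scan(current, template)
        total += windows  # every rotated copy costs a full scan
        if best_sad is None or sad < best_sad:
            best_sad, best_k = sad, k
        current = rotate_cw(current)
    return best_sad, best_k, total


# Hypothetical upright template, and an input whose "face" is tilted by 90 degrees.
TEMPLATE = [[9, 9],
            [1, 1]]
IMAGE = [[0, 0, 0, 0],
         [0, 9, 1, 0],
         [0, 9, 1, 0],
         [0, 0, 0, 0]]

print(detect_over_rotations(IMAGE, TEMPLATE, 1))
print(detect_over_rotations(IMAGE, TEMPLATE, 4))
```

With a square input, each rotated copy requires the same number of window comparisons, so scanning four rotations evaluates four times as many windows as scanning one. The exact match is found only at the rotation that brings the tilted pattern upright.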
In addition, a camera-equipped portable electronic device has been proposed in which the inclination of a portable telephone is detected using an acceleration sensor (incline angle detection unit) and, when a face detection process is performed on a captured image, face detection is performed by rotating the subject image for face detection based on the detected incline information of the portable telephone, or by changing the order in which pixels are read out (for example, refer to Japanese Unexamined Patent Application Publication No. 2009-105559). However, even though the efficiency of the detection process itself increases, processing for rotating the subject image is added. In addition, if pattern matching with the template image is performed only on the subject image after rotation processing, there is a concern that the detection precision is lowered in a case in which the incline of the face of the subject does not match the incline information of the portable telephone.
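The sensor-guided approach and its weakness can be illustrated with a short sketch. It assumes that the tilt is supplied as an angle in degrees by which the image must be rotated clockwise to become upright, that the angle is quantized to 90-degree steps, and that SAD is the matching score. The names `detect_with_tilt` and `best_sad` are hypothetical and are not taken from the cited publication.

```python
def rotate_cw(img):
    """Rotate a grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def best_sad(image, template):
    """Lowest sum of absolute differences over all template positions."""
    th, tw = len(template), len(template[0])
    return min(
        sum(
            abs(image[r + i][c + j] - template[i][j])
            for i in range(th)
            for j in range(tw)
        )
        for r in range(len(image) - th + 1)
        for c in range(len(image[0]) - tw + 1)
    )


def detect_with_tilt(image, template, tilt_deg):
    """Rotate once according to the reported tilt, then scan once."""
    steps = round(tilt_deg / 90) % 4  # assumed quantization to 90-degree steps
    for _ in range(steps):
        image = rotate_cw(image)
    return best_sad(image, template)


# Hypothetical upright template, and an input whose "face" is tilted by 90 degrees.
TEMPLATE = [[9, 9],
            [1, 1]]
IMAGE = [[0, 0, 0, 0],
         [0, 9, 1, 0],
         [0, 9, 1, 0],
         [0, 0, 0, 0]]

# The reported tilt agrees with the face tilt: exact match after one rotation.
print(detect_with_tilt(IMAGE, TEMPLATE, 90))
# The device reports no tilt although the face is tilted: the match degrades.
print(detect_with_tilt(IMAGE, TEMPLATE, 0))
```

Only one scan is performed, which is the efficiency gain, but a rotation step precedes it. When the reported tilt does not correspond to the actual incline of the face, the single scan sees the face in the wrong posture and the score is worse.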