1. Field of the Invention
The present invention relates to a device and a method for face image extraction, and a recording medium having recorded a program for carrying out the method. More specifically, in image processing, such device and method are used to extract, at high speed, a face region from a target image utilizing a template to define position and size thereof.
2. Description of the Background Art
A human face often reflects a person's thoughts and feelings, and is therefore regarded as a significant feature. In image processing, particularly when handling images of people, a system that can automatically detect a human face and determine its position and size in a target image is useful. Here, the target image may be a still picture or a moving picture, and a person appearing therein may be real or artificially created, for example by computer graphics. For this reason, attempts have recently been made in image processing to extract a face region from an arbitrary target image.
Conventional technologies of such face image extraction have been disclosed in Japanese Patent Laid-Open Publication No. 9-73544 (97-73544) (hereinafter, first document) and No. 10-307923 (98-307923) (hereinafter, second document), for example.
The technology disclosed in the first document approximates a face region by an ellipse. The ellipse is defined by five parameters: center coordinates (x, y), a radius r, a ratio b between the major and minor axes, and an angle θ between the major axis and the x axis. These parameters are varied as appropriate to find the values that are optimal for face image extraction.
The technology disclosed in the second document successively finds face parts (e.g., eyes, nose, mouth).
In the first document, however, the approximation requires repeated calculation while those parameters are varied (varying the angle θ is especially time-consuming). Since a face image hardly ever stays the same, real-time approximation, and hence real-time face image extraction, is not feasible with the processing capability of existing personal computers. Moreover, this technology gives no consideration to the possibility that one image may include several human faces, and its applicability is therefore considered narrow.
The technology of the second document cannot be used unless the position of a face region in the image has been defined in advance. It is therefore applicable only to specific images, which again results in narrow applicability.
Therefore, an object of the present invention is to provide a broadly applicable device and method for extracting a face image at high speed by defining the position and size of a face in images of various types, and a recording medium having recorded thereon a program for carrying out the method.
The present invention has the following features to attain the object above.
A first aspect of the present invention is directed to a face image extraction device for defining a face in a target image by position and size for extraction, comprising:
an edge extraction part for extracting an edge part (pixels outlining a person or face) from the target image, and generating an image having only the edge part (hereinafter, edge image);
a template storage part for storing a template composed of a plurality of predetermined concentric shapes equal in shape but varied in size;
a voting result storage part for storing, in an interrelated manner, voting values and coordinates of pixels on the edge image for every size of the concentric shapes of the template;
a voting part for increasing or decreasing the voting values of every pixel, specified by the coordinates, outlining each of the concentric shapes every time a center point of the template moves on the pixels in the edge part; and
an analysis part for defining the face in the target image by position and size based on the voting values stored in the voting result storage part.
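The voting and analysis parts recited above can be illustrated by the following minimal sketch. It assumes a binary edge image held as a list of rows and a template of concentric circles; the function names (circle_offsets, vote, best_face) and the sampling count are hypothetical and are not taken from the invention. Each edge pixel, taken as the template center, increments the voting value of every pixel outlining each circle, with one vote plane per circle size, and the highest voting value then indicates a candidate position and size.

```python
import math

def circle_offsets(radius, samples=64):
    """Integer (dx, dy) offsets outlining one circle of the template."""
    pts = set()
    for k in range(samples):
        a = 2 * math.pi * k / samples
        pts.add((round(radius * math.cos(a)), round(radius * math.sin(a))))
    return sorted(pts)

def vote(edge, radii):
    """edge: 2D list of 0/1 values. Returns {radius: 2D plane of voting values}."""
    h, w = len(edge), len(edge[0])
    planes = {r: [[0] * w for _ in range(h)] for r in radii}
    template = {r: circle_offsets(r) for r in radii}
    for y in range(h):
        for x in range(w):
            if not edge[y][x]:
                continue  # the template center moves only over edge pixels
            for r, offsets in template.items():
                plane = planes[r]
                for dx, dy in offsets:
                    px, py = x + dx, y + dy
                    if 0 <= px < w and 0 <= py < h:
                        plane[py][px] += 1  # increase the voting value
    return planes

def best_face(planes):
    """Return (votes, x, y, radius) for the single highest voting value."""
    best = (-1, 0, 0, 0)
    for r, plane in planes.items():
        for y, row in enumerate(plane):
            for x, v in enumerate(row):
                if v > best[0]:
                    best = (v, x, y, r)
    return best
```

With an edge image that contains a single circular outline, the highest voting value is expected at the center of that outline, in the plane of the matching circle size.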
As described above, in the first aspect, the face position can be detected at high speed using only lightly loaded voting processing and evaluation of the voting values. Further, because the template is composed of concentric shapes varied in size, approximation can be done in a practical manner by comparing, in size, an edge part presumed to include a face region with the template. Accordingly, the size of the face can also be detected at high speed. As such, in the face image extraction device of the present invention, the processing load can be considerably reduced, thereby achieving almost real-time face region extraction even with the processing capabilities of existing personal computers. Further, in the first aspect, neither the location nor the number of face regions in a target image has to be defined prior to extraction, and thus a face can be detected regardless of the size and type of the target image. Accordingly, the applicability of the device is considered quite wide.
Herein, preferably, the predetermined concentric shape is a circle, an ellipse, or a polygon. In such case, the circle may improve the accuracy of the voting result, because the distance from the center point to each pixel outlining the circle is constant.
Preferably, the edge extraction part extracts the edge part from the target image by using a filter for a high frequency component.
A high frequency component can thus be obtained by applying a filter to the target image, so that the position and size of a face can be suitably detected when the target image is a still picture.
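As one possible illustration of such a filter, the sketch below applies a 4-neighbor Laplacian kernel to a grayscale image and thresholds the response. The choice of the Laplacian, the threshold value, and the function name laplacian_edges are assumptions made for the example only; the text above requires only a filter for a high frequency component.

```python
def laplacian_edges(gray, threshold=50):
    """gray: 2D list of intensities. Returns a 2D 0/1 edge image.

    Border pixels are left at 0 because the kernel needs all four neighbors.
    """
    h, w = len(gray), len(gray[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # High-pass response: center weighted 4, each neighbor weighted -1.
            response = (4 * gray[y][x]
                        - gray[y - 1][x] - gray[y + 1][x]
                        - gray[y][x - 1] - gray[y][x + 1])
            if abs(response) >= threshold:
                edge[y][x] = 1
    return edge
```

Flat regions give a zero response, so only pixels on either side of an intensity step are marked as the edge part.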
Preferably, when the target image is composed of a plurality of successive images, the edge extraction part extracts the edge part, for every image composing the target image, by comparing the current image with the temporally preceding image and with the temporally following image to calculate the respective differences.
In this manner, the current image is compared with the temporally preceding image and then with the temporally following image to calculate the respective differences. Accordingly, the position and size of a face can be suitably detected when the target image is a series of moving pictures. Further, with the help of a template for detection, a face region can be stably extracted at high speed even if the facial expression changes greatly, for example during zoom-in or close-up.
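The inter-frame comparison can be sketched as follows. The text above states only that differences from the preceding and following images are calculated; combining the two differences with a logical AND, the threshold value, and the function name motion_edges are assumptions made for this example.

```python
def motion_edges(prev, curr, nxt, threshold=20):
    """Mark pixels of curr that differ from both the previous and next frames.

    prev, curr, nxt: 2D lists of intensities with identical dimensions.
    Returns a 2D 0/1 image.
    """
    h, w = len(curr), len(curr[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d_before = abs(curr[y][x] - prev[y][x])  # difference from preceding image
            d_after = abs(curr[y][x] - nxt[y][x])    # difference from following image
            if d_before >= threshold and d_after >= threshold:
                out[y][x] = 1
    return out
```

Under this assumption, a pixel is kept only when it changed relative to both neighboring frames, which tends to localize the moving contour in the current frame.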
Also preferably, the edge extraction part detects, with respect to the pixels extracted in every predetermined box, the one pixel located at the far-left end or the far-right end of the box on each scanning line, and regards only the pixels thus detected as the edge part.
In this manner, any part within the contour that differs in texture is prevented from being extracted as the edge part. Therefore, the extraction processing can be done, at high speed, with respect to the face region.
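A minimal sketch of this per-scanning-line selection is given below. The box representation (x0, y0, x1, y1) with exclusive upper bounds, the side parameter that selects the far-left or far-right pixel, and the function name outer_pixels are hypothetical; how boxes are laid out over the image is not specified here and is left to the caller.

```python
def outer_pixels(mask, box, side="left"):
    """Keep one extracted pixel per scanning line inside the box.

    mask: 2D list of 0/1 extracted pixels.
    box:  (x0, y0, x1, y1), upper bounds exclusive (assumed convention).
    side: "left" keeps the far-left pixel, anything else keeps the far-right.
    """
    x0, y0, x1, y1 = box
    out = [[0] * len(mask[0]) for _ in mask]
    for y in range(y0, y1):
        xs = [x for x in range(x0, x1) if mask[y][x]]
        if xs:
            x = xs[0] if side == "left" else xs[-1]
            out[y][x] = 1
    return out
```

Pixels between the two ends of a scanning line, such as those produced by texture inside the contour, are discarded, which reduces the number of pixels passed on to the voting.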
Also preferably, the analysis part performs clustering with respect to the voting values stored in each of the voting result storage parts, and narrows down position and size of the face in the target image.
Therefore, even in the case that a target image includes several faces, the face region can be extracted by clustering the voting results (each voting value) and then correctly evaluating correlation thereamong.
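One simple way to cluster voting values is sketched below. No particular clustering algorithm is specified above, so this example assumes grouping of 8-connected cells whose voting value reaches a threshold and reports the peak of each group as a face candidate. The function name cluster_peaks and the min_votes parameter are hypothetical, and the sketch handles a single vote plane (one template size) only.

```python
def cluster_peaks(plane, min_votes):
    """Group 8-connected cells with votes >= min_votes; return each group's peak.

    plane: 2D list of voting values for one template size.
    Returns a list of (votes, x, y) tuples, one per cluster, in scan order.
    """
    h, w = len(plane), len(plane[0])
    seen = [[False] * w for _ in range(h)]
    peaks = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or plane[y][x] < min_votes:
                continue
            stack = [(x, y)]
            seen[y][x] = True
            best = (plane[y][x], x, y)
            while stack:  # flood fill over the current cluster
                cx, cy = stack.pop()
                best = max(best, (plane[cy][cx], cx, cy))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if not seen[ny][nx] and plane[ny][nx] >= min_votes:
                            seen[ny][nx] = True
                            stack.append((nx, ny))
            peaks.append(best)
    return peaks
```

When a target image contains several faces, each face is expected to produce its own cluster of high voting values, so one candidate per cluster is returned rather than a single global maximum.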
Also preferably, the face image extraction device further comprises an image editing part for editing the target image in a predetermined manner by distinguishing a face region defined by position and size in the analysis part from the rest in the target image.
As such, by editing the target image while distinguishing a face region defined by position and size from the rest, only a desired part, i.e., a face, can be emphasized and thus made conspicuous in the target image. As an example, the target image excluding the face region may be solidly shaded, leading to eye-catching effects.
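The shading example can be sketched as follows, assuming that the face region is a circle given by a center (cx, cy) and a radius, as would result from a circular template. The function name shade_outside and the shade parameter are hypothetical.

```python
def shade_outside(gray, cx, cy, radius, shade=0):
    """Return a copy of gray in which every pixel outside the circle is shaded.

    gray: 2D list of intensities; (cx, cy) and radius define the face region.
    """
    r2 = radius * radius
    return [
        [v if (x - cx) ** 2 + (y - cy) ** 2 <= r2 else shade
         for x, v in enumerate(row)]
        for y, row in enumerate(gray)
    ]
```

Pixels inside the circle are copied unchanged, so the face region stands out against the uniformly shaded remainder of the image.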
Still preferably, the face image extraction device further comprises an image editing part for replacing an image of the face region defined by position and size by the analysis part with another.
As such, the image of the face region can be replaced with another image, so that the face can be intentionally concealed. This is effective, for example, when image-monitoring a person who is suffering from dementia. In such case, by replacing the image of the face with another, privacy can be protected while a face area is still defined for monitoring. This also works well when replacing images of a person's movement with those of another type of character.
A second aspect of the present invention is directed to a face image extraction method for defining a face in a target image by position and size for extraction, comprising:
an extraction step of extracting an edge part (pixels outlining a person or face) from the target image, and generating an image having only the edge part (hereinafter, edge image);
a first storage step of storing a template composed of a plurality of predetermined concentric shapes equal in shape but varied in size;
a second storage step of storing, in an interrelated manner, voting values and coordinates of pixels on the edge image for every size of the concentric shapes of the template;
a voting step of increasing or decreasing the voting values of every pixel, specified by the coordinates, outlining each of the concentric shapes every time a center point of the template moves on the pixels in the edge part; and
an analysis step of defining, after the voting step, the face in the target image by position and size based on the voting values.
As described above, in the second aspect, the face position can be detected at high speed using only lightly loaded voting processing and evaluation of the voting values. Further, because the template is composed of concentric shapes varied in size, approximation can be done in a practical manner by comparing, in size, an edge part presumed to include a face region with the template. Accordingly, the size of the face can also be detected at high speed. As such, in the face image extraction device of the present invention, the processing load can be considerably reduced, thereby achieving almost real-time face region extraction even with the processing capabilities of existing personal computers. Further, in the second aspect, neither the location nor the number of face regions in a target image has to be defined prior to extraction, and thus a face can be detected regardless of the size and type of the target image. Accordingly, the applicability of the device is considered quite wide.
Herein, preferably, the predetermined concentric shape is a circle, an ellipse, or a polygon.
In such case, the circle may improve the accuracy of the voting result, because the distance from the center point to each pixel outlining the circle is constant.
Also preferably, in the extraction step, the edge part is extracted from the target image by using a filter for a high frequency component.
Accordingly, a high frequency component is extracted from the target image by using a filter. Therefore, the position and size of a face can be suitably detected when the target image is a still picture.
Also preferably, when the target image is composed of a plurality of successive images, the edge part is extracted, for every image composing the target image, by comparing the current image with the temporally preceding image and with the temporally following image to calculate the respective differences.
In this manner, the current image is compared with the temporally preceding image and then with the temporally following image to calculate the respective differences. Accordingly, the position and size of a face can be suitably detected when the target image is a series of moving pictures. Further, with the help of a template for detection, a face region can be stably extracted at high speed even if the facial expression changes greatly, for example during zoom-in or close-up.
Also preferably, in the extraction step, with respect to the pixels extracted in every predetermined box, the one pixel located at the far-left end or the far-right end of the box is detected on each scanning line, and only the pixels thus detected are regarded as the edge part.
As such, any part within the contour that differs in texture is prevented from being extracted as the edge part. Therefore, the extraction processing can be done, at high speed, with respect to the face region.
Still preferably, in the analysis step, clustering is performed with respect to the voting values stored in each of the voting result storage parts, and the position and size of the face in the target image are narrowed down.
As such, even in the case that a target image includes several faces, the face region can be extracted by clustering the voting results (each voting value) and then correctly evaluating correlation thereamong.
These and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.