The present invention relates to recognition of a specific object from a color image.
As important clues that help extract the facial region of a person from a two-dimensional image, three kinds of information, i.e., motion, color, and shape can be used, and some schemes based on such information have been proposed.
As a facial region recognition scheme using motion, Turk et al. have proposed a scheme for extracting, from an image, a “face space” defined by eigenvectors obtained by the KL transform of a facial image (Matthew A. Turk and Alex P. Pentland, “Face Recognition Using Eigenfaces”, Proc. IEEE Computer Soc. Conf. on Computer Vision and Pattern Recognition, pp. 586-591, 1991). However, with this method, not only is the background included in the facial region, but the number of persons that can be extracted is also small. On the other hand, Kimura et al. have proposed a scheme using difference information between flesh tone information and color information of the background (Kimura, Kato, and Iguchi, “Tracking of Face Image using Skin Color Information”, the Technical Report of the Institute of Electronics, Information and Communication Engineers, HIP96-12, pp. 65-70, 1996). With this scheme, a facial region is stably extracted at high speed. However, these schemes, including that by Turk et al., are premised on moving image data, and cannot be used for extraction from a still image.
As a facial region extraction scheme using color information in a still image, Dai et al. have proposed a scheme for extracting a facial region by classifying a face pattern and other textures using an SGLD matrix, which serves as an index for expressing texture information of density images (Y. Dai and Y. Nakano, “Face-texture model based on SGLD and its application in face detection in a color scene”, Pattern Recognition, vol. 29, no. 6, pp. 1007-1017, 1996). However, this scheme can only cope with full faces captured from ways and requires a large computation volume. On the other hand, Wu et al. have proposed a scheme that extracts a probable facial region on the basis of distribution models of skin and hair colors in Farnsworth's uniform perceptual space, and then extracts the facial region by fuzzy pattern matching (Wu, Chen, and Yachida, “Face Detection from Color Images by Fuzzy Pattern Matching”, the Transactions of the Institute of Electronics, Information and Communication Engineers D-II, vol. J80, no. 7, pp. 1774-1785, 1997). Color information of a face is the most important clue when extracting a facial region at high speed, but it cannot achieve accurate extraction by itself because chromatic components of the background have a large influence.
As a scheme using shape information, Hagiwara et al. have proposed a scheme for searching for a face edge using an ellipse (Yokoo and Hagiwara, “Human Face Detection Method Using Genetic Algorithm”, the Transactions of the Institute of Electrical Engineers of Japan 117-C, 9, pp. 1245-1252, 1997). In this scheme, a plurality of facial regions can be detected, but an ellipse requires five parameters, resulting in a long search time.
As another scheme using shape information, a scheme that pays attention to the density pattern itself of a facial region is available. Hara et al. have proposed a scheme for face template matching based on a genetic algorithm (Hara and Nagao, “Extraction of facial regions of arbitrary directions from still images with a genetic algorithm”, the Technical Report of the Institute of Electronics, Information and Communication Engineers HCS97-12, pp. 37-44, 1997). However, this method is vulnerable to the influence of the background, and detection is difficult when the background is complex.
Furthermore, Yang et al., Juell et al., Ito et al., and Lin et al. have respectively proposed schemes for searching for the density pattern of a face by a neural network (Guangsheng Yang and Thomas S. Huang, “Human Face Detection in a Complex Background”, Pattern Recognition, vol. 27, pp. 53-63, 1994; P. Juell and R. March, “A hierarchical neural network for human face detection”, Pattern Recognition, vol. 29, no. 6, pp. 1017-1027, 1996; Ito, Yamauchi, and Ishii, “Face Image Extraction from a picture by a detecting features of attentive regions of artificial neural network”, the Technical Report of the Institute of Electronics, Information and Communication Engineers NC96-200, pp. 347-453, 1997-03; S. H. Lin and S. Y. Kung, “Face recognition/detection by probabilistic decision-based neural network”, IEEE Trans. Neural Networks, vol. 8, no. 1, pp. 114-132, 1997). However, these extraction schemes based on shape information allow accurate alignment but require a long detection time because of their heavy computations.
The present invention has been made in consideration of the conventional problems and has as its object to provide a method and apparatus for accurately detecting the facial region of a person from a color image at high speed.
It is another object of the present invention to provide an image processing method for extracting a specific object from a color image at high speed.
In order to achieve the above objects, the present invention comprises the following arrangement.
That is, a facial region extraction method for extracting a facial region of a person from a color image, comprises:
the detection step of detecting a flesh tone region;
the generation step of generating a projective distribution of the detected flesh tone region;
the search step of searching the generated projective distribution for a parabola;
the extraction step of extracting a facial region candidate from a position of the parabola found by search; and
the determination step of determining if the extracted facial region candidate is a facial region.
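As an illustration of the generation step above, the following is a minimal sketch of how a projective distribution might be computed. It assumes that the detected flesh tone region is held as a binary mask (a list of rows of 0/1 values) and that the projection is a per-column or per-row pixel count; the function name `projective_distribution` and the mask representation are assumptions made for this example and are not taken from the description.

```python
def projective_distribution(mask, axis="x"):
    """Count flesh tone pixels per column (axis="x") or per row (axis="y").

    `mask` is assumed to be a list of equal-length rows holding 0 or 1.
    """
    if axis == "x":
        # zip(*mask) iterates over columns of the mask
        return [sum(column) for column in zip(*mask)]
    if axis == "y":
        return [sum(row) for row in mask]
    raise ValueError("axis must be 'x' or 'y'")


# Small hypothetical mask: a roughly dome-shaped blob
mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
]
x_projection = projective_distribution(mask, "x")
y_projection = projective_distribution(mask, "y")
```

For a blob of this kind the X-axis projection rises toward the center and falls off on both sides, which is the sort of profile the search step would try to match with a parabola.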
The detection step preferably uses hue and color difference components, and a region having predetermined hue and color difference components is detected as the flesh tone region.
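A detection of this kind could be sketched as follows. The sketch assumes that the hue is taken from the HSV color model and that the color difference component is the I component of the YIQ model; both choices, the threshold values, and the names `is_flesh_tone` and `flesh_tone_mask` are illustrative assumptions rather than values given in the description.

```python
import colorsys

# Hypothetical thresholds, chosen only for illustration
HUE_RANGE_DEG = (0.0, 50.0)   # assumed hue window, in degrees
I_RANGE = (0.02, 0.35)        # assumed window for the YIQ "I" component


def is_flesh_tone(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside both assumed windows."""
    rf, gf, bf = r / 255.0, g / 255.0, b / 255.0
    hue_deg = colorsys.rgb_to_hsv(rf, gf, bf)[0] * 360.0
    i_component = colorsys.rgb_to_yiq(rf, gf, bf)[1]
    return (HUE_RANGE_DEG[0] <= hue_deg <= HUE_RANGE_DEG[1]
            and I_RANGE[0] <= i_component <= I_RANGE[1])


def flesh_tone_mask(image):
    """Turn an image (rows of (r, g, b) tuples) into a binary 0/1 mask."""
    return [[1 if is_flesh_tone(*pixel) else 0 for pixel in row]
            for row in image]
```

In practice the windows would have to be tuned on sample images; the values here only show where such thresholds would enter.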
The search step preferably includes the step of searching for a parabola by a genetic algorithm that matches the projective distribution against the parabola while changing a parameter of the parabola.
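The following is a minimal sketch of such a search, not the specific algorithm of the invention. It assumes a downward-opening parabola with three parameters (apex position, height, and curvature), a fitness equal to the negative absolute error against the projective distribution, and a generic genetic algorithm with tournament selection, blend crossover, Gaussian mutation, and elitism. All of these choices, and every name in the code, are assumptions made for illustration.

```python
import random


def parabola(x, apex_x, height, curvature):
    """Downward-opening parabola, clipped at zero."""
    return max(0.0, height - curvature * (x - apex_x) ** 2)


def fitness(gene, hist):
    """Higher is better: negative absolute error between model and data."""
    apex_x, height, curvature = gene
    return -sum(abs(value - parabola(x, apex_x, height, curvature))
                for x, value in enumerate(hist))


def search_parabola(hist, pop_size=60, generations=80, seed=0):
    """Search (apex_x, height, curvature) with a simple genetic algorithm."""
    rng = random.Random(seed)
    lo = (0.0, 1.0, 0.05)                                    # assumed lower bounds
    hi = (float(len(hist) - 1), max(hist) * 1.5 + 1.0, 2.0)  # assumed upper bounds

    def clamp(gene):
        return tuple(min(max(v, l), h) for v, l, h in zip(gene, lo, hi))

    def tournament(pop):
        return max(rng.sample(pop, 3), key=lambda g: fitness(g, hist))

    pop = [tuple(rng.uniform(l, h) for l, h in zip(lo, hi))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, hist), reverse=True)
        next_pop = pop[:2]                       # elitism: keep the two best
        while len(next_pop) < pop_size:
            a, b = tournament(pop), tournament(pop)
            t = rng.random()                     # blend crossover weight
            child = tuple(t * u + (1 - t) * v for u, v in zip(a, b))
            child = tuple(                       # Gaussian mutation per parameter
                v + rng.gauss(0, 0.05 * (h - l)) if rng.random() < 0.2 else v
                for v, l, h in zip(child, lo, hi))
            next_pop.append(clamp(child))
        pop = next_pop
    return max(pop, key=lambda g: fitness(g, hist))


# Synthetic projective distribution with its apex at x = 20
hist = [parabola(x, 20, 30, 0.5) for x in range(40)]
best = search_parabola(hist)
```

Because only three parameters are searched, each candidate is cheap to evaluate, which is consistent with the point that a parabola can be searched more quickly than a five-parameter ellipse.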
The extraction step preferably uses a parabola obtained from a projective distribution in only one axis direction of a coordinate system.
The extraction step preferably uses two parabolas obtained from projective distributions in two axis directions of a coordinate system.
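One way a candidate region might be derived from two parabolas is sketched below. It assumes that each parabola is described by an apex position, a height, and a curvature, and that the candidate is the rectangle spanned by the points where each parabola falls to zero. This construction and the name `candidate_box` are assumptions for illustration; the description does not specify how the candidate is obtained from the parabola positions.

```python
import math


def candidate_box(x_parabola, y_parabola):
    """Return (left, top, right, bottom) from two (apex, height, curvature) tuples.

    Each parabola height - curvature * (t - apex)**2 reaches zero at
    apex +/- sqrt(height / curvature); those points bound the candidate.
    """
    def extent(parabola):
        apex, height, curvature = parabola
        half_width = math.sqrt(height / curvature)
        return apex - half_width, apex + half_width

    left, right = extent(x_parabola)
    top, bottom = extent(y_parabola)
    return left, top, right, bottom


# Hypothetical parabolas found in the X- and Y-axis projections
box = candidate_box((10.0, 16.0, 1.0), (20.0, 9.0, 1.0))
```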
The determination step preferably includes the step of determining if the facial region candidate is a facial region using an aspect ratio of the facial region candidate.
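A check of this kind could be as simple as the following sketch. The acceptable range of width-to-height ratios is a hypothetical value chosen for the example, as is the function name `has_face_like_aspect`.

```python
def has_face_like_aspect(width, height, min_ratio=0.6, max_ratio=1.0):
    """Accept a candidate whose width/height ratio lies in an assumed range."""
    if width <= 0 or height <= 0:
        return False
    return min_ratio <= width / height <= max_ratio
```

A candidate that is much wider than it is tall, for example, would be rejected before any more expensive determination is attempted.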
The determination step preferably includes the step of determining if the facial region candidate is a facial region using a neural network that has learned facial patterns in advance.
The neural network preferably has learned face data in a plurality of directions as teacher images.
The determination step preferably includes the step of determining if the facial region candidate is a facial region based on the presence/absence of a hair region in the facial region candidate.
An image processing method for extracting a specific object from a color image, comprises:
the detection step of detecting a specific color region indicating the specific object from a color image; and
the recognition step of performing pattern recognition using a genetic algorithm with respect to the detected specific region.
Preferably, the detection step includes the step of detecting a region having a specific shape that indicates the specific object by analyzing the detected specific color region in X- and Y-axis directions, and
the recognition step includes the step of performing pattern recognition using the genetic algorithm with respect to the detected region having the specific shape that indicates the specific object.
The method preferably further comprises the step of determining the number of objects to be extracted, and
when objects corresponding in number to a value designated in advance have been detected, extraction ends.
The method preferably further comprises the step of determining an extraction processing time, and when a time designated in advance has been reached, extraction ends.
The method preferably further comprises the extraction region calculation step of calculating a ratio between a total area of detected objects and an area of an image searched, and
when an extraction area ratio has exceeded a value designated in advance, extraction ends.
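The three end conditions described above (number of objects, processing time, and extraction area ratio) could be combined as in the following sketch. The function name `should_stop`, its parameters, and the convention that `None` disables a condition are assumptions made for this example.

```python
import time


def should_stop(found, started_at, detected_area, image_area,
                max_objects=None, max_seconds=None, max_area_ratio=None,
                now=None):
    """Return True as soon as any enabled end condition is satisfied."""
    now = time.monotonic() if now is None else now
    # Number of detected objects has reached the designated value
    if max_objects is not None and found >= max_objects:
        return True
    # Designated processing time has been reached
    if max_seconds is not None and now - started_at >= max_seconds:
        return True
    # Ratio of detected area to searched image area has exceeded the limit
    if max_area_ratio is not None and detected_area / image_area > max_area_ratio:
        return True
    return False
```

A search loop would call such a function after each extraction, passing the running count, the start time, and the accumulated area of the detected objects.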
The recognition step preferably includes the step of replacing a new gene obtained by genetic manipulation with the gene having the highest adaptation level among the selected genes when the new gene expresses a position within a region from which an object has already been extracted.
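This replacement rule might be sketched as follows. The sketch assumes that the first two values of a gene encode an (x, y) position, that already extracted regions are stored as (left, top, right, bottom) rectangles, and that the adaptation level is supplied as a function; these representations and the name `resolve_child` are assumptions for illustration.

```python
def inside(box, x, y):
    """True if (x, y) lies within the (left, top, right, bottom) rectangle."""
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def resolve_child(child, selected, extracted_boxes, adaptation):
    """Keep the new gene unless it points into an already extracted region.

    In that case, return the selected gene with the highest adaptation level.
    """
    x, y = child[0], child[1]
    if any(inside(box, x, y) for box in extracted_boxes):
        return max(selected, key=adaptation)
    return child


# Hypothetical data: one region already extracted, two selected genes
extracted = [(0, 0, 10, 10)]
selected = [(30, 30), (50, 50)]
level = {(30, 30): 0.4, (50, 50): 0.9}.get
```

With this rule the search does not spend further generations on positions where an object has already been found.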
An image processing method for extracting a facial region of a person from a color image, comprises the steps of:
detecting substantially parabolic boundaries with respect to X- and Y-axes, which are formed by an object in a predetermined color, from a color image; and
detecting an elliptic region from detection results, and determining the detected elliptic region as a facial region.
The substantially parabolic boundaries are preferably detected by changing at least some of parameters of a parabola and determining parameters with a highest adaptation level as boundaries of the object.
With this arrangement, since candidate regions are extracted by searching for parabolas defined by a small number of parameters, a quick search can be done, and the time required for extracting a facial region can be shortened. Also, since candidates are found by search based only on the color and projective distribution of regions, and it is checked if each candidate region is a facial region, candidates for a facial region can be narrowed down, and the facial region can be quickly extracted.
Furthermore, upon detecting a specific object from a color image, candidates for the object to be detected are narrowed down based on the color of a region, and pattern recognition is made for the candidates, thus allowing quick recognition of an object.
Moreover, upon detecting a facial region, a parabolic region is extracted in the X- and Y-axis directions, and an elliptic region which is likely to be a facial region is detected from the extracted region. Hence, candidates that undergo detection can be narrowed down, and quick facial region detection is achieved. Since candidates are extracted based on a relatively simple figure, i.e., a parabola, they can be extracted quickly.
Since the user sets in advance a threshold value used to end processing, an end condition suitable for an object image to be found can be set, and a search can be ended without wasting time.
Since a plurality of conditions can be set as the end condition, a search repeats itself until specific conditions are satisfied, thus reliably making processing converge.
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.