The current invention relates to the field of digital detection for skin color pixels in a digital image, and in particular to the field of hair-skin color pixel separation in a digital image.
Skin color detection is adopted as a preliminary step in automatic redeye detection and correction of consumer images for real-time use (see commonly assigned U.S. Ser. No. 08/919,560 filed Aug. 29, 1997 entitled "A Computer Program Product for Redeye Detection"). In this redeye detection application, skin color areas are first located and subsequent steps are employed to determine if the red dots within the skin color areas are true red eyes. The success of redeye detection largely depends on the success of having clean face skin color regions identified. Previous experience shows that it is particularly difficult to obtain clean face regions in color images in the presence of blond hairs.
The algorithm designed in the aforementioned redeye detection application is used for locating and correcting redeye defects in consumer images without user intervention such as ROI (region-of-interest) selection. The major goal of the redeye detection algorithm is to detect redeye defects with a minimum number of false positives and within a minimum execution time without sacrificing performance. For this reason, face-region (with most of the hair regions eliminated) detection is performed so that unnecessary processing is not performed on red-dot candidates that are not located in detected face regions. The easiest and fastest way for face-region localization is the use of skin-color pixel detection that requires only pixel operations. It has been shown that difficulties arise when dealing with images having faces associated with blond hairs. In these cases, the skin-color detection process fails to produce satisfactory or desired results that would assist in the redeye detection procedure. FIG. 1 displays an example picture 500 that causes problems. In FIG. 1, objects 504, 505, 508, 509, and 510 are non-skin objects; 501 and 506 are blond hairs; 502, 503, and 507 are skin regions. In FIG. 2, picture 600 shows the example result of conventional skin detection algorithms. Clearly, the skin-color detection process does not separate the hairs from the face (see object 601) and therefore the subsequent redeye detection process will take the whole head plus the body as the face region and no redeye defects will be corrected.
There have been many publications recently addressing skin color detection for face recognition in color image processing, but only a few of them concern the issue of hair-face skin pixel identification. For instance, in Wu et al. (H. Wu, Q. Chen, and M. Yachida, "Face Detection From Color Images Using a Fuzzy Pattern Matching Method," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 6, pp. 557-563, 1999), a hair model is used to assist face detection. The RGB color information in an image is first converted to CIE's XYZ color representation through a linear transformation, resulting in one luminance component Y and two chromaticity components x=X/(X+Y+Z) and y=Y/(X+Y+Z). Then the two chromaticity components, x and y, are further converted to another space through a non-linear transformation, resulting in two new color components, u and v, that are perceptually uniformly distributed in the new color space. The hair model is a function of three variables: the luminance Y and the chromaticities u and v. Notably, this hair model works mainly for Asian faces with dark hairs. Moreover, the conversion from RGB to the corresponding CIE tristimulus values requires knowledge of the color calibration, which varies from imaging device to device.
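The chromaticity components x and y defined above follow directly from the tristimulus values. The following Python sketch is illustrative only: the function name xy_chromaticity is hypothetical, the XYZ values are assumed to be already available (the device-dependent RGB-to-XYZ matrix is deliberately omitted), and the handling of an all-zero input is an assumption of this sketch rather than part of the cited method.

```python
def xy_chromaticity(X, Y, Z):
    """Return the CIE (x, y) chromaticity for tristimulus values X, Y, Z."""
    total = X + Y + Z
    if total == 0:
        # Chromaticity is undefined for black; returning (0.0, 0.0)
        # is an assumption made only for this sketch.
        return 0.0, 0.0
    # x = X/(X+Y+Z), y = Y/(X+Y+Z), as defined in the text above.
    return X / total, Y / total
```

Because both components are ratios, scaling X, Y, and Z by a common positive factor leaves x and y unchanged; the luminance information is carried separately by Y.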
There are a number of color spaces currently used by researchers for color image processing as described below.
Psychological space: Some researchers believe that the RGB basis is not a particularly good one for explaining the perception of colors. Alternatively, a transformed non-RGB space, or psychological space, is well accepted for describing 'colors'. It is compatible with human color perception. This non-RGB space is composed of three components: hue (H), saturation (S), and brightness value (V). Instead of using three values (R,G,B) to distinguish color objects, a single component, H, is used to label a color pixel in this transformed space.
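As an illustration of labeling a pixel by the single component H, the following Python sketch uses the standard-library colorsys module. The helper name hue_degrees, the 8-bit input range, and the choice of degrees as the output unit are assumptions of this sketch, not details taken from the text.

```python
import colorsys

def hue_degrees(r, g, b):
    """Return the hue, in degrees in [0, 360), of an 8-bit RGB pixel."""
    # colorsys expects floats in [0, 1] and returns hue in [0, 1).
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0
```

With this convention pure red maps to a hue near 0 degrees, pure green near 120 degrees, and pure blue near 240 degrees. Gray pixels have no meaningful hue; colorsys reports 0 for them.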
CIELab space: In 1976 the CIE recommended the CIELab formula for color measurement. It is designed so that colors in the CIELab space are perceptually more uniformly spread than are colors in RGB and psychological (e.g. HSV) spaces. Therefore, using the CIELab space enables the use of a fixed color distance in decision making over a wide range of colors.
Lst space: The Lst space is traditionally called T-space in graphic applications and is constructed with log exposures. L is the luminance component, and s and t are the two chrominance components. It has been shown that the statistical distribution of color signals is more symmetrical in Lst space than in linear space.
YCRCB space: The YCRCB space is designed for television broadcasting signal conversion. Y is the luminance component, and CR and CB are the two chrominance components. Researchers working with video images prefer using this space.
Generalized R-G-B Space (gRGB): This is also called normalized R-G-B space. This space is transformed from the native R-G-B space by normalizing each of the three elements of the original R-G-B by the summation of the three original elements. The resultant three new elements are linearly dependent, so only two elements are needed to effectively form a new space that is collapsed from three dimensions to two dimensions. For this reason it is also called a collapsed R-G-B space in some articles. This space does not provide an explicit luminance component like the other spaces. This generalization process reduces the illuminant effects on chromaticity components.
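The normalization described above can be sketched in a few lines of Python. The function name to_grgb and the treatment of an all-zero (black) pixel are assumptions of this sketch; the text does not specify how that case is handled.

```python
def to_grgb(R, G, B):
    """Normalize an RGB triple by the sum of its elements (gRGB)."""
    total = R + G + B
    if total == 0:
        # The ratio is undefined for black; mapping it to the neutral
        # point (1/3, 1/3, 1/3) is an assumption made for this sketch.
        return (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    return (R / total, G / total, B / total)

# The three outputs sum to 1, so any two of them determine the third.
bright = to_grgb(200, 120, 80)
dim = to_grgb(100, 60, 40)  # same color at half the intensity
```

Here bright and dim are both approximately (0.5, 0.3, 0.2): a uniform change in intensity leaves the normalized components unchanged, which is one way to see how the normalization reduces illuminant effects, and the linear dependence (the components sum to 1) is what allows the space to be collapsed to two dimensions.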
So far, there are no conclusive data showing that any one of the above color spaces is overwhelmingly superior to the others in terms of skin-color detection performance. Rather, the performance of a skin-color detector mostly depends on the structure of the detector itself. The reality is that a space transformation from RGB to another color domain does not change the overlap between the skin-pixel and non-skin-pixel distributions that exists in the original RGB space. This distribution overlap is the major cause of false positives (FP) in skin-color detection. The selection of a color space largely depends on the designers' preference, practical effectiveness, and model complexity.
Table 1 illustrates the computation expense for different transformation operations. Among them, gRGB transformation has the lowest computation expense. This is one of the advantages that gRGB transformation provides.
What is therefore needed is a way to provide an efficient skin-color detection method with which desirable clean face regions can be obtained with very low computation complexity prior to further image processing for applications such as the aforementioned redeye detection in the presence of blond hairs. The present invention describes a mechanism that overcomes the difficulty of separating blond hair pixels from skin color pixels by fusing two color detection strategies that work in generalized RGB (gRGB) space and the hue space respectively.
It is an objective of the present invention to automatically detect the skin color region in a digital color image.
It is a further object of the present invention to produce a clean skin color image in the presence of blond hairs.
The present invention is directed to overcoming one or more of the problems set forth above. Briefly summarized, according to one aspect of the present invention, a method and a computer program product for generating a clean skin color image comprises the steps of:
(a) converting an input RGB color image from the native RGB space to a preferred color space, the generalized RGB (gRGB) space, resulting in a gRGB color image;
(b) selecting two of the generalized RGB color components (generalized R and G in this case) to form a working space for skin color detection;
(c) classifying the pixels of the converted input image, the gRGB color image, into skin color and non-skin color pixels;
(d) eliminating non-skin color pixels and forming a binary skin color mask;
(e) masking the gRGB color image obtained in step (a) to form a masked gRGB image;
(f) converting the masked gRGB image to a hue image;
(g) separating blond hair color pixels and the skin color pixels in the hue space using the hue image;
(h) removing the blond hair color pixels in the hue image and forming a new skin color binary mask;
(i) masking the original RGB color image to form a masked color image that retains skin color pixels only.
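Steps (a) through (i) can be sketched as a two-stage, per-pixel filter in Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the rectangular skin region in the generalized (r, g) plane, its numeric bounds, the hue cut value, and the direction of the hue cut are all hypothetical placeholders, since the text above gives no classifier parameters. All function and constant names are likewise hypothetical, and the binary masks are represented implicitly by a per-pixel keep flag.

```python
import colorsys

# Hypothetical placeholder parameters; not taken from the source text.
R_RANGE = (0.36, 0.60)   # assumed bounds on generalized R for skin
G_RANGE = (0.26, 0.37)   # assumed bounds on generalized G for skin
HUE_SPLIT_DEG = 25.0     # assumed cut: hues below it are kept as skin

def to_grgb(pixel):
    """Step (a): normalize an (R, G, B) pixel by the sum of its elements."""
    R, G, B = pixel
    total = R + G + B
    if total == 0:
        return (0.0, 0.0, 0.0)  # black: treated as non-skin in this sketch
    return (R / total, G / total, B / total)

def is_skin_rg(grgb):
    """Steps (b)-(d): classify using only generalized R and G."""
    r, g, _b = grgb
    return (R_RANGE[0] <= r <= R_RANGE[1]) and (G_RANGE[0] <= g <= G_RANGE[1])

def hue_deg(grgb):
    """Step (f): hue of a gRGB pixel in degrees."""
    h, _s, _v = colorsys.rgb_to_hsv(*grgb)
    return h * 360.0

def clean_skin_image(image):
    """Return a copy of image (rows of RGB tuples) with non-skin pixels zeroed."""
    result = []
    for row in image:
        out_row = []
        for pixel in row:
            grgb = to_grgb(pixel)
            keep = is_skin_rg(grgb)          # first-stage mask
            if keep:                         # steps (e)-(h): refine in hue
                keep = hue_deg(grgb) < HUE_SPLIT_DEG
            # Step (i): mask the original RGB pixel.
            out_row.append(pixel if keep else (0, 0, 0))
        result.append(out_row)
    return result

# One row: a skin-like pixel, a blond-like pixel, and a blue pixel.
sample = [[(200, 120, 90), (220, 180, 110), (30, 60, 200)]]
cleaned = clean_skin_image(sample)
```

With these placeholder values the blue pixel is rejected in the first stage, while the blond-like pixel passes the (r, g) test and is removed only by the hue refinement. The second stage evaluates hue only for pixels that survive the first stage and reuses the gRGB values already computed.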
These and other aspects, objects, features and advantages of the present invention will be more clearly understood and appreciated from a review of the following detailed description of the preferred embodiments and appended claims, and by reference to the accompanying drawings.
The present invention is effective in obtaining a clean skin color image in the presence of blond hair color pixels and has the advantages of:
(1) performing skin detection in the dimension-reduced space;
(2) conducting skin detection refinement in a one-dimensional space;
(3) fusing two color classification strategies performed in two stages (skin detection and skin detection refinement);
(4) effectively using intermediate results of the first stage in the second stage;
(5) providing imaging applications such as redeye detection with cleaner skin color images.