Human skin tone detection is used in numerous applications such as video surveillance, face and gesture recognition, human computer interaction, image and video indexing and retrieval, image editing, vehicle drivers' drowsiness detection, and controlling users' browsing behaviour (e.g., surfing pornographic sites).
Skin tone detection involves choosing a colour space, providing a skin model for the colour space and processing regions obtained from an image using the skin model to fit any specific application.
There exist several colour spaces including, for example, RGB, CMY, XYZ, xyY, UVW, LSLM, L*a*b*, L*u*v*, LHC, LHS, HSV, HSI, YUV, YIQ, YCbCr.
The native representation of colour images is typically the RGB colour space which describes the world view in three colour matrices: Red (R), Green (G) and Blue (B).
Some skin detection algorithms operate in this colour space. For example, Kovač, J., Peer, P., and Solina, F., (2003), “Human Skin Colour Clustering for Face Detection”, EUROCON 2003 International Conference on Computer as a Tool, Ljubljana, Slovenia, September 2003, eliminate luminance by basing their approach on the RGB components not being close together, using the following rules:
An RGB pixel is classified as skin iff R > 95 & G > 40 & B > 20 & max(R, G, B) − min(R, G, B) > 15 & |R − G| > 15 & R > G & R > B
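The rule quoted above can be expressed directly as a per-pixel predicate. The following is a minimal sketch only, assuming 8-bit channel values in the range 0 to 255; the function name `is_skin_rgb` is hypothetical and is not taken from the cited paper.

```python
def is_skin_rgb(r, g, b):
    """Return True if an 8-bit RGB pixel satisfies the quoted rule.

    The thresholds are copied from the rule as quoted in the text;
    the function name and the 0..255 range are illustrative assumptions.
    """
    return (
        r > 95 and g > 40 and b > 20          # minimum level per channel
        and max(r, g, b) - min(r, g, b) > 15  # components not close together
        and abs(r - g) > 15                   # red and green well separated
        and r > g and r > b                   # red is the dominant channel
    )


# Example: a reddish-beige pixel passes, a neutral grey pixel does not.
print(is_skin_rgb(220, 170, 140))
print(is_skin_rgb(128, 128, 128))
```

Because every condition compares raw channel values, the test depends on absolute brightness as well as on the relation between the channels.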
However, many colour spaces used for skin detection are based on linear transforms from RGB and many of these transformations are directed towards extracting luminance information from colour information to decorrelate luminance from the colour channels.
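As an illustration of such a linear transform, the sketch below converts an RGB pixel to YCbCr. The text does not specify which YCbCr variant is meant; the coefficients used here are those of the full-range ITU-R BT.601 form used by JPEG/JFIF, and both that choice and the name `rgb_to_ycbcr` are assumptions made for the example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert an 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: full-range BT.601 coefficients as used by JPEG/JFIF.
    Y carries the luminance; Cb and Cr carry the colour information,
    offset so that a neutral grey maps to Cb = Cr = 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Neutral pixels of different brightness differ in Y only.
print(rgb_to_ycbcr(64, 64, 64))
print(rgb_to_ycbcr(255, 255, 255))
```

For neutral pixels the brightness change appears only in Y, which is the decorrelation of luminance from the colour channels that the transform is intended to provide.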
It is appreciated that the terms illumination and luminance are slightly different and indeed depend on each other. However, for simplicity, in the present specification, they are used interchangeably as each is a function of response to incident light flux or the brightness.
Some literature, such as Albiol, A., Torres, L., and Delp, E. J. (2001), “Optimum color spaces for skin detection”, Proceedings of the IEEE International Conference on Image Processing, vol. 1, 122-124, argues that the choice of colour space has no implication on the detection provided an optimum skin detector is used; in other words, all colour spaces perform the same.
By contrast, others discuss in depth the different colour spaces and their performance including Martinkauppi J. B., Soriano M. N., and Laaksonen M. H. (2001), “Behavior of skin color under varying illumination seen by different cameras at different color spaces”, In Proc. of SPIE vol. 4301, Machine Vision Applications in Industrial Inspection IX, pages 102-113, 2001; and Son Lam Phung, Bouzerdoum A., and Chai D., (2005), “Skin Segmentation Using Color Pixel Classification: Analysis and Comparison”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 1, pp. 148-154, January, 2005.
Furthermore, Abadpour A., and Kasaei S., (2005), “Pixel-Based Skin Detection for Pornography Filtering”, Iranian Journal of Electrical & Electronic Engineering, IJEEE, 1(3): 21-41, July 2005 concluded that “in the YUV, YIQ, and YCbCr colour spaces, removing the illumination related component (Y) increases the performance of skin detection process”.
By contrast again, Jayaram, S., Schmugge, S., Shin, M. C. and Tsap, L. V. (2004), “Effect of Colorspace Transformation, the Illuminance Component, and Color Modeling on Skin Detection”, Proc. of the 2004 IEEE Computer Vision and Pattern Recognition (CVPR'04), IEEE Computer Society, conclude that the illumination component provides different levels of information for the separation of skin and non-skin colour, and thus that the absence of illumination does not help boost performance.
Hsu R.-L., Abdel-Mottaleb M. and Jain A. K. (2002), “Face detection in color images”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24(5), 696-702, 2002; and Vezhnevets V., Sazonov V., and Andreeva A., (2003), “A Survey on Pixel-Based Skin Color Detection Techniques”, Proc. Graphicon-2003, pp. 85-92, Moscow, Russia, September 2003 disclose dropping luminance prior to any processing, as they indicate that the mixing of chrominance and luminance data mars RGB based analysis and makes RGB a not very favourable choice for colour analysis and colour based recognition.
The approach of Hsu et al. is shown in more detail in FIG. 1. They use a model based on a concentration of human skin colour in CbCr space for face detection in colour images. As shown in FIG. 1, these two components were calculated after performing a lighting compensation using a “reference white” to normalise the colour appearance.
Yun Jae-Ung., Lee Hyung-Jin., Paul A. K., and Baek Joong-Hwan., (2007) “Robust Face Detection for Video Summary Using Illumination-Compensation and Morphological Processing”, Third International Conference on Natural Computation, 710-714, 24-27 Aug. 2007, added an extra morphological step to the approach of Hsu et al.
Shin, M. C., Chang, K. I., and Tsap, L. V. (2002), “Does colorspace transformation make any difference on skin detection?”, IEEE Workshop on Applications of Computer Vision, question the benefit of colour transformation for skin tone detection, e.g., RGB versus non-RGB colour spaces, while also arguing that the use of the Orthogonal Colour Space (YCbCr) gives better skin detection results compared to seven other colour transformations.
Also, US 2005/0207643A1, Lee, H. J. and Lee, C. C., discloses clustering human skin tone in the YCbCr space.
Another space, the Log-Opponent (LO) space, uses a base 10 logarithm to convert the RGB matrices into I, Rg and By components. The concept behind such hybrid colour spaces is to combine different colour components from different colour spaces to increase the efficiency of colour components to discriminate colour data.
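A minimal sketch of a log-opponent conversion is given below. The text states only that a base 10 logarithm is used to obtain I, Rg and By; the particular combination of channels shown here (intensity taken from the green channel, Rg as a red minus green difference, By as blue minus the mean of green and red) is one commonly cited form and should be read as an assumption, as should the offset of 1 used to avoid log10(0) and the name `rgb_to_log_opponent`. Published variants may also apply a scale factor and a small noise term, which are omitted here.

```python
import math


def rgb_to_log_opponent(r, g, b):
    """Convert an RGB pixel to assumed log-opponent (I, Rg, By) values.

    Assumptions (not taken from the text): L(x) = log10(x + 1),
    I = L(G), Rg = L(R) - L(G), By = L(B) - (L(G) + L(R)) / 2.
    """
    log_r = math.log10(r + 1)
    log_g = math.log10(g + 1)
    log_b = math.log10(b + 1)
    intensity = log_g
    rg = log_r - log_g
    by = log_b - (log_g + log_r) / 2
    return intensity, rg, by


# A neutral grey pixel has no opponent colour; a reddish pixel has Rg > 0.
print(rgb_to_log_opponent(100, 100, 100))
print(rgb_to_log_opponent(200, 100, 100))
```

Under these assumptions the two opponent components are differences of logarithms, so they depend on ratios between channels rather than on their absolute levels.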
In Forsyth, D. and Fleck, M. (1999), “Automatic Detection of Human Nudes”, International Journal of Computer Vision 32(1): 63-77, Springer Netherlands, two spaces are used, namely IRgBy and HS from the HSV (Hue, Saturation and Value) colour space. A texture amplitude map is used to find regions of low texture information. The algorithm first locates images containing large areas whose colour and texture are appropriate for skin, and then segregates those regions with little texture. The texture amplitude map is generated from the matrix I by applying 2D median filters.
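One way a texture amplitude map might be derived from the matrix I with 2D median filters is sketched below. The text does not give the filter sizes or the exact sequence of operations; the sequence used here (median-smooth I, take the absolute difference from the original, then median-filter that difference) and the 3x3 window with clamped borders are assumptions, and the names `median_filter` and `texture_amplitude` are hypothetical.

```python
from statistics import median


def median_filter(img, radius=1):
    """2D median filter over a list-of-lists image, clamping at the borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = median(window)
    return out


def texture_amplitude(intensity, radius=1):
    """Assumed sequence: smooth, absolute difference, smooth again."""
    smooth = median_filter(intensity, radius)
    diff = [
        [abs(a - b) for a, b in zip(row_i, row_s)]
        for row_i, row_s in zip(intensity, smooth)
    ]
    return median_filter(diff, radius)


# A flat patch yields zero amplitude; one-pixel-wide stripes do not.
flat = [[5] * 6 for _ in range(6)]
stripes = [[10 if x % 2 else 0 for x in range(6)] for _ in range(6)]
print(texture_amplitude(flat)[2][2], texture_amplitude(stripes)[2][2])
```

Regions where the resulting map is close to zero would correspond to the regions of low texture information referred to above.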
Nonetheless, there remains a need to provide an improved method of skin tone detection.