The present invention relates to digital colour images and, in particular, to the detection of faces in colour digital images.
Colour digital images are increasingly being stored in multi-media databases, and utilised in various computer applications. In many such applications it is desirable to be able to detect the location of a face in a visual image as one step in a multi-step process. The multi-step process can include content-based image retrieval, personal identification or verification for use with automatic teller machines or security cameras, or automated interaction between humans and computational devices.
Various prior art face detection methods are known, including eigenfaces, neural networks, clustering, feature identification and skin colour techniques. Each of these techniques has its strengths and weaknesses. However, one feature which they have in common is that they are either computationally intensive and therefore very slow, or fast but not sufficiently robust to detect faces reliably.
The eigenface or eigenvector method is particularly suitable for face recognition and has some tolerance for lighting variation. However, it does not cope with different viewpoints of faces and does not handle occlusion of facial features (such as occurs if a person is wearing sunglasses). It is also not scale invariant.
The neural network approach utilises training based on a large number of face images and non-face images. It has the advantages of being relatively simple to implement and of providing some tolerance to the occlusion of facial features and to lighting variation. It is also relatively easy to improve the detection rate by re-training the neural network using false detections. However, it is not scale invariant, does not cope with different viewpoints or orientations, and requires an exhaustive search to locate faces in an image.
The clustering technique is somewhat similar to the eigenface approach. A pixel window (e.g. 20×20) is typically moved over the image, and the distances between the resulting test pattern and each of a prototype face image and a prototype non-face image are represented by a vector. The vector captures the similarities and differences between the test pattern and the face model. A neural network can then be trained to classify whether the vector represents a face or a non-face. While this method is robust, it does not cope with different scales, viewpoints or orientations. It also requires an exhaustive search to locate faces and relies upon assumed parameters.
The feature identification method is based upon searching for potential facial features or groups of facial features such as eyebrows, eyes, nose and mouth. The detection process involves identifying facial features and grouping these features into feature pairs, partial face groups, or face candidates. This process is advantageous in that it is relatively scale invariant, there is no exhaustive searching, it is able to handle the occlusion of some facial features and it is also able to handle different viewpoints and orientations. Its main disadvantages are that there are potentially many false detections and that its performance is very dependent upon the facial feature detection algorithms used.
The use of skin colour to detect human faces is described in a paper by Yang J and Waibel A (1995) “Tracking Human Faces in Real-Time”, CMU-CS-95-210, School of Computer Science, Carnegie Mellon University. This proposal was based on the concept that the human visual system adapts to different brightnesses and illumination sources which implies that the human perception of colour is consistent within a wide range of environmental lighting conditions. It was therefore thought possible to remove brightness from the skin colour representation while preserving accurate, but low dimensional, colour information. As a consequence, in this prior art technique, the chromatic colour space was used. Chromatic colours (e.g. r and g) can be derived from the RGB values as:
r=R/(R+G+B) and g=G/(R+G+B)
These chromatic colours are known as “pure” colours in the absence of brightness.
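The conversion above can be illustrated with a short sketch. The function name `to_chromatic` and the handling of a pure black pixel (where R+G+B is zero and the ratios are undefined) are assumptions made for this illustration and are not taken from the cited paper.

```python
def to_chromatic(R, G, B):
    """Map an RGB triple to the chromatic pair r = R/(R+G+B), g = G/(R+G+B)."""
    total = R + G + B
    if total == 0:
        # Assumption of this sketch: pure black has no defined chromaticity,
        # so the neutral point (1/3, 1/3) is returned instead of dividing by zero.
        return (1.0 / 3.0, 1.0 / 3.0)
    return (R / total, G / total)


# The same surface colour at two brightness levels maps to one chromatic point.
dim = to_chromatic(100, 50, 50)
bright = to_chromatic(200, 100, 100)
print(dim, bright)
```

Both calls give the same (r, g) pair because doubling R, G and B leaves the ratios unchanged, which is the sense in which brightness is removed from the representation.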
Utilising this colour space, Yang and Waibel found that the skin colour distributions of different people, including people of different races, were clustered together. This means that the skin colours of different people are very close in the chromatic colour space and that they differ mainly in intensity.
This prior art method first generated a skin colour distribution model using a set of example face images from which skin colour regions were manually selected. Then the test image was converted to the chromatic colour space. Next, each pixel in the test image (as converted) was compared to the distribution of the skin colour model. Finally, all skin colour pixels so detected were identified, and regions of adjacent skin colour pixels could then be considered potential face candidates.
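The model-building and per-pixel comparison steps of this prior art method might be sketched as follows. This is an illustration only: the passage above does not state the form of the distribution model, so the sketch assumes a single two-dimensional Gaussian over (r, g) with a Mahalanobis-distance threshold. The function names, the threshold `max_distance`, the regularising constant `EPS` and the sample RGB values are all hypothetical.

```python
EPS = 1e-9  # assumed small constant keeping the covariance invertible


def to_chromatic(R, G, B):
    total = R + G + B
    return (R / total, G / total) if total else (1.0 / 3.0, 1.0 / 3.0)


def fit_skin_model(samples):
    """Fit a mean and 2x2 covariance to manually selected skin pixels (RGB)."""
    pts = [to_chromatic(*p) for p in samples]
    n = len(pts)
    mr = sum(r for r, _ in pts) / n
    mg = sum(g for _, g in pts) / n
    srr = sum((r - mr) ** 2 for r, _ in pts) / n + EPS
    sgg = sum((g - mg) ** 2 for _, g in pts) / n + EPS
    srg = sum((r - mr) * (g - mg) for r, g in pts) / n
    return (mr, mg), (srr, srg, sgg)


def is_skin(pixel, model, max_distance=3.0):
    """True if the pixel's (r, g) lies within max_distance of the model mean,
    measured as a Mahalanobis distance."""
    (mr, mg), (srr, srg, sgg) = model
    r, g = to_chromatic(*pixel)
    dr, dg = r - mr, g - mg
    det = srr * sgg - srg * srg
    # Squared distance using the closed-form inverse of the 2x2 covariance.
    d2 = (sgg * dr * dr - 2.0 * srg * dr * dg + srr * dg * dg) / det
    return d2 <= max_distance ** 2


def skin_mask(image, model):
    """Compare every pixel of an image (rows of RGB tuples) with the model."""
    return [[is_skin(px, model) for px in row] for row in image]


# Hypothetical hand-picked skin samples, for illustration only.
samples = [(220, 170, 140), (200, 150, 120), (180, 120, 100),
           (240, 190, 160), (160, 110, 90), (210, 160, 140)]
model = fit_skin_model(samples)
mask = skin_mask([[(200, 150, 120), (30, 60, 200)]], model)
print(mask)
```

Regions of adjacent `True` entries in the resulting mask correspond to the potential face candidates mentioned above; grouping them would need a connected-component pass, which is omitted here.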
This prior art method has the advantage that processing colour is much faster than processing individual facial features, that colour is substantially orientation invariant and that it is insensitive to the occlusions of some facial features. The system is also substantially viewpoint invariant and scale invariant. However, the method suffers from a number of disadvantages including that the colour representation of the face can be influenced by different lighting conditions, and that different cameras (e.g. digital or film) can produce different colour values even for the same person in the same environment.
However, a significant disadvantage of the prior art methods is that the skin colour model is not very discriminating (i.e. selecting pixels on the basis of whether they fall within the skin colour distribution results in many non-skin colour pixels being included erroneously). It is also difficult to locate clusters or regions of skin colour pixels that can be considered as candidate faces.
An object of the present invention is to provide an improved method of detecting one or more faces in digital colour images.
In accordance with a first aspect of the present invention there is disclosed a method of detecting a face in a colour digital image, said method comprising the steps of:
1. segmenting said image into a plurality of regions each having a substantially homogeneous colour,
2. testing the colour of each said region created in step 1 to determine those regions having predominantly skin colour, and
3. subjecting only the regions determined in step 2 to further facial feature analysis whereby said regions created in step 1 not having a predominantly skin colour are not subjected to said further feature analysis.
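The three steps above can be sketched in outline as follows. This is a minimal sketch under stated assumptions and not the disclosed implementation: the flood-fill segmentation with a per-channel tolerance `tol`, the use of a region's mean colour to decide whether it is predominantly skin coloured, and the callables `is_skin_colour` and `analyse_features` are all hypothetical choices made for illustration.

```python
from collections import deque


def segment(image, tol=30):
    """Step 1 (assumed technique): grow 4-connected regions of pixels whose
    channels are all within `tol` of the region's seed pixel.
    Returns a list of regions, each a list of (row, col) coordinates."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            seed = image[y][x]
            seen[y][x] = True
            queue, region = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx]:
                        if all(abs(a - b) <= tol for a, b in zip(image[ny][nx], seed)):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(region)
    return regions


def mean_colour(image, region):
    n = len(region)
    return tuple(sum(image[y][x][c] for y, x in region) / n for c in range(3))


def detect_faces(image, is_skin_colour, analyse_features):
    """Steps 2 and 3: only regions passing the colour test are analysed further."""
    faces = []
    for region in segment(image):
        if is_skin_colour(mean_colour(image, region)):      # step 2
            if analyse_features(image, region):              # step 3
                faces.append(region)
    return faces


# Toy usage with placeholder callables.
SKIN, SKY = (210, 160, 130), (40, 90, 200)
image = [[SKY, SKY, SKY, SKY],
         [SKY, SKIN, SKIN, SKY],
         [SKY, SKIN, SKIN, SKY]]


def crude_skin_test(rgb):
    R, G, B = rgb
    total = R + G + B
    return total > 0 and 0.36 <= R / total <= 0.47 and 0.28 <= G / total <= 0.36


analysed = []


def record_and_accept(image, region):
    analysed.append(region)  # records which regions reach feature analysis
    return True


faces = detect_faces(image, crude_skin_test, record_and_accept)
print(len(analysed), sorted(faces[0]))
```

In the toy image the background region never reaches `record_and_accept`, which illustrates the saving described in step 3: the feature analysis runs only on regions that pass the colour test.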
Preferably the above described method includes the further step of using in step 2 a colour distribution model utilising previously characterised sampled data.
Still further, step 2 preferably also includes the use of a better defined colour distribution model for a particular image (e.g. flash or non-flash) based on information provided from the image source (e.g. a camera).
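One way such a selection could look is sketched below. Everything here is hypothetical: the `flash_fired` key standing in for information supplied by the image source, the representation of each model as a pair of chromatic ranges, and the numeric bounds, which are arbitrary placeholders and not measured values.

```python
# Placeholder models: each is a pair of (low, high) bounds on chromatic r and g.
# The numbers are arbitrary and serve only to make the sketch runnable.
GENERAL_MODEL = {"name": "general", "r": (0.33, 0.50), "g": (0.26, 0.37)}
SPECIFIC_MODELS = {
    True: {"name": "flash", "r": (0.36, 0.46), "g": (0.29, 0.35)},
    False: {"name": "non-flash", "r": (0.35, 0.50), "g": (0.27, 0.36)},
}


def select_model(image_info):
    """Pick the more specific model when the source reports the flash state,
    otherwise fall back to the general model."""
    flash_fired = (image_info or {}).get("flash_fired")
    return SPECIFIC_MODELS.get(flash_fired, GENERAL_MODEL)


def matches_model(rgb, model):
    """Test a region's colour against the selected model's chromatic ranges."""
    R, G, B = rgb
    total = R + G + B
    if total == 0:
        return False
    r, g = R / total, G / total
    return (model["r"][0] <= r <= model["r"][1]
            and model["g"][0] <= g <= model["g"][1])


chosen = select_model({"flash_fired": True})
print(chosen["name"], matches_model((210, 160, 130), chosen))
```

When the image source supplies no such information, `select_model` returns the general model, so the colour test of step 2 can still be applied.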
Preferably the further facial feature analysis is independent of facial colour.
In accordance with a second aspect of the present invention there is disclosed apparatus for detecting a face in a colour digital image, said apparatus comprising,
segmenting means to segment said image into a plurality of regions each having a substantially homogeneous colour,
colour detecting means coupled to said segmenting means to determine those regions having predominantly skin colour, and
analysis means coupled to said colour detecting means to subject only those regions having predominantly skin colour to a facial feature analysis.
In accordance with a third aspect of the present invention there is disclosed a computer readable medium incorporating a computer program product for detecting a face in a colour digital image, said computer program product including a sequence of computer implementable instructions for carrying out the steps of:
1. segmenting said image into a plurality of regions each having a substantially homogeneous colour,
2. testing the colour of each said region created in step 1 to determine those regions having predominantly skin colour, and
3. subjecting only the regions determined in step 2 to further facial feature analysis whereby said regions created in step 1 not having a predominantly skin colour are not subjected to said further feature analysis.