1. Field of the Invention
The present invention relates generally to image recognition technology, and more particularly, to a method and an apparatus for detecting a lip region for lip reading of an image including a face.
2. Description of the Related Art
Conventional speech recognition technology relies on a voice signal, so ambient noise degrades recognition performance. To solve this problem, technology for recognizing speech using image information of the lips, tongue, teeth, etc., of a speaker included in an image, i.e., lip-reading or Visual Speech Recognition (VSR) technology, is currently being researched and developed.
Processing an image signal for lip reading includes detecting a lip region and then extracting a lip characteristic.
To detect the lip region, information such as the center point, width, and height of the speaker's lips is detected from the entire input image based on color information. In conventional lip region detection, a face region included in the image is first detected based on the color information, and the lip region is then detected within the detected face region. Such lip region detection either uses geometric information of the face or is based on color information of the lips.
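The conventional two-stage, color-based approach described above can be sketched as follows. This is only an illustrative sketch, not the method of any particular prior-art system: the function names (`is_skin`, `is_lip`, `bounding_box`, `detect_lip_region`) are hypothetical, the skin test is one commonly cited RGB rule of thumb, and the lip "redness" threshold is an assumed value chosen for illustration. The image is assumed to be a list of rows of `(R, G, B)` tuples.

```python
def is_skin(pixel):
    """Assumed RGB skin heuristic (a common rule of thumb, not from the source)."""
    r, g, b = pixel
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def is_lip(pixel):
    """Assumed lip heuristic: lips are markedly redder than surrounding skin."""
    r, g, b = pixel
    return r - g > 80 and r > b


def bounding_box(image, predicate, region=None):
    """Return (x0, y0, x1, y1) enclosing all pixels matching predicate, or None."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = region or (0, 0, width - 1, height - 1)
    xs, ys = [], []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if predicate(image[y][x]):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def detect_lip_region(image):
    """Stage 1: find the face by skin color. Stage 2: find the lips inside it."""
    face = bounding_box(image, is_skin)
    if face is None:
        return None
    lips = bounding_box(image, is_lip, face)
    if lips is None:
        return None
    x0, y0, x1, y1 = lips
    return {
        "center": ((x0 + x1) / 2, (y0 + y1) / 2),
        "width": x1 - x0 + 1,
        "height": y1 - y0 + 1,
    }
```

Because the second stage searches only inside the box found by the first, a failure of the color-based face detection prevents the lip region from being found at all.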
However, the color and contrast of a face vary with a person's skin color and race, so it is difficult to detect the face region on the basis of a single uniform color criterion. Further, changes in illumination alter the color information, so the performance for detecting the face region deteriorates greatly. In particular, illumination changes more severely in the use environment of a mobile communication terminal, which is not usually used in a fixed place, so the performance for detecting the face region based on the color information deteriorates greatly. Consequently, if the image signal is processed for lip reading in the mobile communication terminal, the performance for detecting the face region deteriorates greatly.
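The sensitivity to illumination can be illustrated with a small sketch. It assumes the same hypothetical RGB skin heuristic as a stand-in for a fixed color criterion, and it models an illumination change very crudely as a uniform gain applied to all three channels (`scale_illumination` is a hypothetical helper; real lighting changes are more complex).

```python
def is_skin(pixel):
    """Assumed fixed RGB skin rule, standing in for a color-based face detector."""
    r, g, b = pixel
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def scale_illumination(pixel, gain):
    """Crude illumination model (an assumption): uniform gain, clipped to 255."""
    return tuple(min(255, round(c * gain)) for c in pixel)


skin_pixel = (200, 150, 120)                       # illustrative skin tone
dim_pixel = scale_illumination(skin_pixel, 0.4)    # same skin in poor lighting

print(is_skin(skin_pixel))  # accepted under the reference lighting
print(is_skin(dim_pixel))   # rejected: the red channel falls below the threshold
```

The same surface is accepted under one lighting condition and rejected under another, which is the failure mode a fixed color criterion faces when the terminal moves between environments.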