In an image pattern recognition apparatus for recognizing a reflected intensity image of an object, an image captured from light reflected by the object surface (the reflected intensity image) is input (image input processing). An image area containing the recognition object is extracted from the input image (pattern extraction processing). The image area is converted to a pattern of predetermined size (pattern normalization processing). This pattern is converted to predetermined input data (feature extraction processing). This input data is compared with previously registered dictionary data, and a similarity is calculated (similarity calculation processing).
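As an illustration of the pattern normalization processing and the feature extraction processing described above, the following is a minimal sketch and not the method of any particular apparatus. It assumes a grayscale image held as a list of rows, nearest-neighbour resampling for the normalization, and a unit-length flattened vector as the input data. The function names `crop`, `normalize_pattern`, and `to_feature_vector` are hypothetical.

```python
def crop(image, top, left, bottom, right):
    """Cut out the extracted image area (bounds are inclusive)."""
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def normalize_pattern(region, out_h, out_w):
    """Resample a region to a predetermined size (nearest neighbour, assumed)."""
    in_h, in_w = len(region), len(region[0])
    return [
        [region[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def to_feature_vector(pattern):
    """Flatten the pattern and scale it to unit length (assumed feature)."""
    flat = [float(v) for row in pattern for v in row]
    length = sum(v * v for v in flat) ** 0.5
    return [v / length for v in flat] if length > 0 else flat


# Toy 4x4 image whose pixel value encodes its position (row * 4 + column).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
region = crop(image, 0, 0, 3, 3)
pattern = normalize_pattern(region, 2, 2)   # every second pixel is kept
feature = to_feature_vector(pattern)
print(pattern)
print(feature)
```

Whatever resampling is chosen, the point of the normalization is that every extracted area yields a vector of the same dimension, so that it can be compared with the dictionary data.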
In the pattern extraction processing, a background subtraction method, a temporal subtraction method, and a template matching method are selectively used. In the background subtraction method, a difference between an image not including the recognition object (background image) and an image including the recognition object (input image) is calculated, and an area of large difference value is extracted as an area including the recognition object. In the temporal subtraction method, a difference between two images input at different times is calculated, and an area of large difference value is extracted as an area including the recognition object, which is detected by its movement. In the template matching method, a template representing an image feature of the recognition object is scanned over the input image, and the area of largest correlation value is extracted as an area including the recognition object. The background subtraction method and the temporal subtraction method execute the pattern extraction processing more quickly than the template matching method.
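The three extraction methods above can be sketched as follows. This is a simplified illustration under stated assumptions, not a definitive implementation: images are small grayscale grids, the difference is thresholded per pixel, and the template score is a normalized correlation. The names `difference_mask`, `bounding_box`, `best_template_match`, `place_object`, and the threshold of 30 are hypothetical.

```python
def difference_mask(image_a, image_b, threshold):
    """Mark pixels whose absolute intensity difference exceeds the threshold."""
    return [
        [abs(a - b) > threshold for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None."""
    hits = [(r, c) for r, row in enumerate(mask) for c, hit in enumerate(row) if hit]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


def best_template_match(image, template):
    """Scan the template over the image; return ((row, col), score) of the best fit."""
    th, tw = len(template), len(template[0])
    t_flat = [v for row in template for v in row]
    t_mean = sum(t_flat) / len(t_flat)
    t_centered = [v - t_mean for v in t_flat]
    t_norm = sum(v * v for v in t_centered) ** 0.5
    best_pos, best_score = None, float("-inf")
    if t_norm == 0:
        return best_pos, best_score  # a flat template carries no feature
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            p_mean = sum(patch) / len(patch)
            p_centered = [v - p_mean for v in patch]
            p_norm = sum(v * v for v in p_centered) ** 0.5
            if p_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(p * t for p, t in zip(p_centered, t_centered)) / (p_norm * t_norm)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


OBJECT = [[200, 50], [50, 200]]  # toy recognition object


def place_object(top, left):
    """Build a 5x5 frame of uniform background (value 10) containing OBJECT."""
    frame = [[10] * 5 for _ in range(5)]
    for i, row in enumerate(OBJECT):
        for j, value in enumerate(row):
            frame[top + i][left + j] = value
    return frame


background = [[10] * 5 for _ in range(5)]
previous_frame = place_object(1, 0)
input_image = place_object(1, 2)

# Background subtraction: background image vs. input image.
background_box = bounding_box(difference_mask(background, input_image, 30))
# Temporal subtraction: two frames taken at different times.
temporal_box = bounding_box(difference_mask(previous_frame, input_image, 30))
# Template matching: exhaustive scan of the template over the input image.
match_position, match_score = best_template_match(input_image, OBJECT)
print(background_box, temporal_box, match_position, match_score)
```

In this toy case the temporal difference covers both the old and the new position of the object, while the template scan visits every window of the image, which is why the two subtraction methods are the quicker ones.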
In the similarity calculation processing, a distance evaluation method, a subspace method, and a mutual subspace method are selectively used. In the distance evaluation method, the input data and the dictionary data are each represented as a vector of the same dimension and the same feature; a distance between the two vectors is evaluated; and the object in the input data is recognized based on the evaluation. In the subspace method, the dictionary data is represented as a dictionary subspace generated from a plurality of vectors; a distance between the input vector and the dictionary subspace is evaluated; and the object in the input data is recognized based on the evaluation. In the mutual subspace method, the input data is also represented as an input subspace generated from a plurality of vectors; a distance between the input subspace and the dictionary subspace is evaluated; and the object in the input data is recognized based on the evaluation. In each method, the distance between the input data and the dictionary data is converted to a similarity in order to recognize the object.
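The three similarity calculations above can be illustrated with a short sketch. The following assumptions are not taken from the text: the distance is Euclidean, subspaces are spanned by orthonormal bases obtained by Gram-Schmidt, the subspace method scores the squared cosine of the angle between the input vector and the dictionary subspace, and the mutual subspace method scores the squared cosine of the smallest canonical angle between the two subspaces. All function names are hypothetical, and a practical system would normally use a library eigenvalue or SVD routine instead of the power iteration shown here.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def euclidean_distance(u, v):
    """Distance evaluation method: distance between two feature vectors."""
    return norm([a - b for a, b in zip(u, v)])


def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis of the subspace spanned by the vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        length = norm(w)
        if length > eps:  # drop vectors that add no new direction
            basis.append([wi / length for wi in w])
    return basis


def subspace_similarity(x, basis):
    """Subspace method: cos^2 of the angle between x and the subspace."""
    return sum(dot(x, b) ** 2 for b in basis) / dot(x, x)


def mutual_subspace_similarity(basis_in, basis_dict, iterations=200):
    """Mutual subspace method: cos^2 of the smallest canonical angle.

    This is the largest eigenvalue of M^T M, where M[i][j] is the inner
    product of basis_in[i] and basis_dict[j]; it is estimated here by
    power iteration.
    """
    m = [[dot(u, v) for v in basis_dict] for u in basis_in]
    k = len(basis_dict)
    g = [[sum(row[i] * row[j] for row in m) for j in range(k)] for i in range(k)]
    vec = [1.0 / (i + 1) for i in range(k)]  # arbitrary non-zero start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        nxt = [dot(row, vec) for row in g]
        length = norm(nxt)
        if length == 0:
            return 0.0  # subspaces are orthogonal (or a basis is empty)
        vec = [c / length for c in nxt]
        eigenvalue = dot(vec, [dot(row, vec) for row in g])
    return eigenvalue


# Dictionary subspace: the xy-plane of a 3-dimensional feature space.
dictionary_basis = orthonormal_basis([[1, 0, 0], [1, 1, 0]])

print(euclidean_distance([0, 0], [3, 4]))
print(subspace_similarity([1, 0, 1], dictionary_basis))
print(mutual_subspace_similarity(orthonormal_basis([[0, 1, 1]]), dictionary_basis))
```

With these definitions, an input vector lying inside the dictionary subspace scores 1 and an orthogonal one scores 0, so the score can be compared directly against a recognition threshold.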
However, the background subtraction method and the temporal subtraction method have the following two well-known problems.
(1) If a plurality of objects are included in the input image, the area of the recognition object cannot be extracted from the input image by the difference alone. As a result, each difference area must be verified based on an image feature by using the template matching method.
(2) If the illumination environment changes because of the weather or the passage of time, unexpected noise is mixed into the difference value. As a result, the area of the recognition object is not correctly extracted.
In order to solve these problems, it is necessary that the recognition object obtains a high difference value in the difference image. Concretely, the following two solution ideas are necessary.
(A) A camera means is controlled so that only the recognition object is captured in the input image.
(B) The difference value is calculated using an image representation not affected by illumination changes.
However, the prior art does not consider concrete means for realizing the two solution ideas (A) and (B) with respect to the above-mentioned two problems (1) and (2). As a result, image pattern recognition that quickly extracts the recognition object by using the difference is difficult.
Furthermore, Japanese Patent Disclosure (Kokai) PH9-251534 discloses a person recognition method in which a person's face is the recognition object. In this method, pattern extraction processing by the template matching method is combined with similarity calculation processing by the mutual subspace method. The pattern extraction, the pattern normalization, and the similarity calculation are stably executed despite changes of facial direction and expression. In particular, a separability filter robust to changes of illumination is used in order to extract facial parts such as pupils and nostrils. In this case, the pattern normalization is executed based on the locations of the facial parts so that the normalized pattern does not vary with changes of facial direction or expression. In this method, the nostrils are used as facial parts. Therefore, the camera (image input means) is located at a lower part of the display that the user faces, in order to capture the nostrils of the user in the image. However, the following two problems exist in this method.
(3) Concrete or detailed conditions for the location of the camera are not disclosed. The detection of the facial parts is not assured if the camera is arbitrarily located.
(4) No idea is disclosed for actively keeping the user in a situation in which the facial parts are stably detected from the input image. As a result, the detection of the facial parts may fail depending on the whim of the user.
As mentioned above, the following two problems occur in the image pattern recognition method of the prior art.
(1) The recognition object is not captured alone in the image. As a result, a pattern of the recognition object is not correctly extracted by the difference processing only.
(2) Noise areas other than the recognition object are included in the difference value because of noise sources such as illumination changes. As a result, the pattern of the recognition object is not stably extracted by the difference processing only.
Furthermore, in the person identification method of the prior art, the following two problems occur.
(3) The method of locating the camera means so as to assure the extraction of the facial parts is not apparent. As a result, the possibility of failing to extract the facial parts remains.
(4) No target means exists for guiding the user so as to assure the extraction of the facial parts. As a result, the possibility of failing to extract the facial parts remains.