1. Field of the Invention
The present invention relates to an apparatus and method for recognizing an object belonging to a specific category. More particularly, the present invention relates to an apparatus and method for recognizing an object using a plurality of images representing the object in terms of different attributes.
2. Description of the Related Art
Japanese Laid-Open Publication No. 8-287216, entitled “Internal facial feature recognition method”, discloses a technique for recognizing a target object in a plurality of images representing the target object in terms of different attributes. In this conventional technique, internal facial features (e.g., the mouth) are recognized from a far-infrared light (light having a wavelength of 8 to 10 μm) image and a visible light image, i.e., from a plurality of images representing a target object in terms of different attributes. A far-infrared light image represents the intensity of far-infrared light emitted from a target object. Since the intensity of far-infrared light emitted from a target object can correspond to the temperature of the target object, a region having a specific temperature (e.g., about 36° C., which is the typical skin temperature of a human) can be extracted from a far-infrared light image.
Using only a temperature image often makes it difficult to detect a target object. For example, when detecting a human (a target object), if other objects having substantially the same temperature as that of a human (e.g., an electric appliance in a room) are in the vicinity of the human, it is difficult to detect the human accurately. To avoid this, a region having a skin color in a visible light image is referenced so as to improve the detection of a human.
In a conventional technique as described in the aforementioned publication, in order to locate a feature to be recognized, a matching needs to be established between a skin temperature region extracted from a far-infrared light image and a skin color region extracted from a visible light image. To establish such a matching, the following procedures have to be performed in advance: (1) a skin temperature region (a region having a temperature of about 36° C.) is accurately extracted from a far-infrared light image, and a skin color region is accurately extracted from a visible light image, and (2) a matching is established between pixels in the far-infrared light image and pixels in the visible light image.
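The two procedures above may be illustrated by the following minimal sketch. It is not taken from the aforementioned publication. The function names, the temperature tolerance of 1.5° C., and the RGB skin-color rule are all hypothetical values chosen for illustration only. The sketch assumes that the two images have the same size and are already pixel-aligned.

```python
def skin_temperature_mask(temp_image, center=36.0, tol=1.5):
    """Mark pixels whose temperature (deg C) lies within tol of center.

    center follows the "about 36 deg C" figure in the text; tol is an
    assumed value, not one given in the source.
    """
    return [[abs(t - center) <= tol for t in row] for row in temp_image]


def skin_color_mask(rgb_image):
    """Mark pixels passing a crude, hypothetical RGB skin-color rule."""
    return [
        [
            r > 95 and g > 40 and b > 20 and r > g and r > b
            and (r - min(g, b)) > 15
            for (r, g, b) in row
        ]
        for row in rgb_image
    ]


def match_regions(temp_mask, color_mask):
    """Pixelwise AND of the two masks; requires pixel-aligned images."""
    same_shape = len(temp_mask) == len(color_mask) and all(
        len(a) == len(b) for a, b in zip(temp_mask, color_mask)
    )
    if not same_shape:
        raise ValueError("images must be pixel-aligned and the same size")
    return [
        [a and b for a, b in zip(row_t, row_c)]
        for row_t, row_c in zip(temp_mask, color_mask)
    ]


# Toy 2x3 scene: the left column is skin; the pixels at (0, 1) and (1, 2)
# are a warm gray appliance at about skin temperature.
temp_image = [[36.2, 36.0, 22.0],
              [35.8, 21.5, 36.1]]
rgb_image = [[(200, 150, 120), (128, 128, 128), (30, 60, 90)],
             [(200, 150, 120), (30, 60, 90), (128, 128, 128)]]

result = match_regions(skin_temperature_mask(temp_image),
                       skin_color_mask(rgb_image))
```

In this toy scene the temperature mask alone also marks the appliance pixels, and only the combination with the color mask rejects them. The pixelwise combination is meaningful only because the two images are assumed to be aligned, which corresponds to procedure (2).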
The matching between the far-infrared light image and the visible light image requires accurate alignment of optical axes of a visible light camera and a far-infrared light camera, leading to a problem that the structure of an image capturing system and the initial settings for object recognition become complicated.
In order to accurately extract a skin temperature region from a far-infrared light image, calibration has to be performed frequently so as to compensate for the influence of the temperature of the optical system, circuits, and elements of a far-infrared light camera, which changes over time. Alternatively, the entire far-infrared light camera may be held at a constant temperature (e.g., by cooling the camera) so as to eliminate the influence of the temperature of the optical system, circuits, and elements of the far-infrared light camera. Unfortunately, as a result, the settings and maintenance of a recognition system comprising a far-infrared light camera become complicated, leading to an increase in cost.
In addition, skin temperature varies significantly under the influence of sunlight or ambient temperature. Especially outdoors, skin temperature is likely to deviate far from a standard temperature of 36° C. due to variations in conditions, such as sunlight and ambient temperature. Skin temperature also varies depending on the time of day. If skin temperature varies in this manner, it becomes difficult to accurately detect a skin temperature region from a far-infrared light image. In order to accurately extract a skin temperature region under varying environmental conditions, a different extraction algorithm has to be prepared for each set of environmental conditions, which makes it difficult to provide the initial settings of the recognition system.
With a visible light image, it is likewise difficult to accurately detect the color of a target object in an environment, such as the outdoors, in which a camera is easily affected by sunlight, a headlight of a car, or the like. This is because the limit of the dynamic range of a camera or the spectral distribution of a light source cannot be fixed. In order to accurately extract a skin color region under varying environmental conditions, a different extraction algorithm has to be prepared for each set of environmental conditions, which makes it difficult to provide the initial settings of the recognition system.
Moreover, the extraction of a skin temperature region from a far-infrared light image and the extraction of a skin color region from a visible light image are processes specific to the attributes of an individual target object. Such processes do not work when the target object is changed. For example, a region extraction algorithm has to be newly prepared when the above-described conventional technique is applied to the recognition of an animal. Therefore, an extraction algorithm has to be prepared for each object to be recognized, which makes it difficult to provide the initial settings of a recognition system.
As described above, according to the conventional technique, the requirement of establishing the matching between a region in a far-infrared light image and a region in a visible light image gives rise to the problems that it is practically difficult to provide the settings for recognition of a target object and that the recognition system is easily affected by environmental conditions.