1. Field of the Invention
The present invention relates to an image processing technology, and particularly to a technology for removing the influence of specular reflection and shadows, which is a problem in processing images shot under a general environment.
2. Description of the Prior Art
Conventionally, methods using cameras are widely used for detecting position and posture of an object, recognizing an object, detecting motion of an object, and the like. These are realized by applying image processing, such as pattern matching, optical flow detection, and feature point extraction, to images shot by a camera.
For example, for detecting the position and posture of an object from images shot by a camera, there has been proposed a method in which a 3D model of the object is aligned and synthesized with an image from a stereo camera (for example, Patent Document 1).
However, while image processing as in the above Patent Document 1 is effective for images without noise, it cannot provide sufficient reliability and accuracy for images shot under a general environment (for example, Non-patent Document 1). The main factor is that general image processing assumes objects to be diffuse reflectors, and therefore ignores that the color information of image data varies with the camera position due to the influence of specular reflection, and that pixel values change largely in the presence of shadows.
For example, under a real environment such as in a home, specular reflection caused by regular reflection of a light source lowers detection accuracy and the recognition rate. FIG. 37 is an illustration schematically showing an image obtained by shooting a mobile phone 201 in a home. Indoors, at least one light source 202 generally exists. When the light source and a camera satisfy a relationship of regular reflection on the surface of the mobile phone 201, a specular reflection region 203 in which pixel values are high appears. For this reason, when the mobile phone 201 is to be detected by pattern matching using a reference image 204 shown in FIG. 38, the specular reflection region 203 has both an intensity and edge information largely different from those of a region 205 corresponding thereto in the reference image 204. As a result, the detection accuracy is largely lowered. In addition, the specular reflection region moves according to the camera position and varies in intensity according to the lighting condition. Here, the lighting condition means the number and positions of the light sources.
Specular reflection exerts significant influence also on stereo matching processing. FIG. 40 shows images obtained respectively by shooting an object 207 using a stereo camera set 206L, 206R as shown in FIG. 39. As shown in FIG. 40, specular reflection regions 203L, 203R appear in right and left images, respectively. However, the position and the color information of the specular reflection regions 203L, 203R are different from each other, so that the right image and the left image are quite different from each other. This lowers accuracy of stereo matching.
Moreover, this problem is caused not only by specular reflection but also by a shadow cast by an object existing in the vicinity (cast shadow) and a shadow appearing when the normal direction N of an object forms an angle of more than 90 degrees with a light source direction L (attached shadow) (see FIG. 41). When an occluding object 208 exists in the vicinity of the mobile phone 201, as shown in FIG. 37, the occluding object 208 casts a shadow on the mobile phone 201 to generate a shadow region 209 on the mobile phone 201. The shadow region 209 differs from the reference image 204 and lowers the accuracy, as in the case of the specular reflection.
In order to solve the above problems, correction of the specular reflection and shadow regions is widely performed as pre-processing for image processing. As methods for estimating the specular reflection and shadow regions, there are several proposals: a first conventional example (for example, Patent Document 2) in which the difference in polarization characteristic between specular reflection and diffuse reflection is utilized and a polarization filter is used; a second conventional example (for example, Patent Document 3) in which a specular reflection region is separated by rotating an object and utilizing a multi-spectrum camera; and a third conventional example (for example, Non-patent Document 2) in which a "linearized image," which is an image in an ideal state in which specular reflection is not caused, is synthesized by utilizing images of an object illuminated by a light source in various directions, and specular reflection and shadow regions are separated by utilizing the linearized image.
However, the first conventional example necessitates mounting a polarization filter on the camera, which makes realization with a general camera difficult. The second conventional example needs shooting while the object is placed on a rotary table, and is thus unsuitable for home use.
In the third conventional example, all that is required is to change the position of a light source illuminating the object, and the position of the light source may be unknown. Thus, it is effective under a general environment such as in a home.
The third conventional example will be described. First, diffuse reflection, specular reflection, and shadows, which are optical phenomena, will be described with reference to FIG. 42.
When a dichromatic reflection model is assumed, the intensity of an object can be expressed as the sum of a diffuse reflection component and a specular reflection component. In a Lambertian model, the pixel value Id of the diffuse reflection component is expressed by the following expression.

Id = n · s  (Expression 1)

wherein n is the product of the normal direction N of the surface of the object and a diffuse reflectance (albedo), and s is the product of a unit vector in a light source direction and the intensity of the light source.
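As a sketch of how (Expression 1) is evaluated, the following fragment computes the diffuse pixel value from a normal, an albedo, and a light source vector. All numerical values are illustrative assumptions, not taken from the document:

```python
import numpy as np

# Sketch of (Expression 1): diffuse intensity under the Lambertian model.
albedo = 0.5                           # diffuse reflectance of the surface (assumed)
N = np.array([0.0, 0.0, 1.0])          # unit surface normal
light_dir = np.array([0.6, 0.0, 0.8])  # unit vector toward the light source
light_intensity = 100.0                # light source intensity (assumed)

n = albedo * N                         # n: normal direction scaled by albedo
s = light_intensity * light_dir        # s: light direction scaled by intensity

I_d = np.dot(n, s)                     # Id = n . s
print(I_d)                             # 40.0
```

A pixel facing away from the light would make this dot product negative, which is where the attached-shadow condition discussed below comes from.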
Shadows are classified into attached shadow, which appears when the normal direction of an object does not face the light source direction, and cast shadow, which is caused by occlusion of light by another object. Where there is no influence of ambient light and inter-reflection, both have an intensity of 0. In (Expression 1), however, attached shadow takes a negative value while cast shadow takes a positive value.
Shashua indicates that, on the assumption of a parallel light source and perfectly diffuse reflection, an image in an arbitrary light source direction can be expressed by a linear combination of three images different in light source direction (see Non-patent Document 3). In other words, when three images whose light source direction vectors are linearly independent are I1, I2, and I3, an image Ik in an arbitrary light source direction can be expressed by the following linear combination.

Ik = ck1I1 + ck2I2 + ck3I3  (Expression 2)

wherein ck = [ck1 ck2 ck3]T is called a "linearization coefficient set" for the image Ik. An image generated by such a linear sum is called a "linearized image."
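The relationship in (Expression 2) can be illustrated with synthetic images: an image under a fourth lighting condition is built as a linear combination of three images, and the coefficient set ck is then recovered by least squares over the pixels. The image sizes and coefficient values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 4, 4
# Three ideal diffuse images under different light source directions (synthetic).
I1, I2, I3 = (rng.random((h, w)) for _ in range(3))

# Per (Expression 2), an image under another lighting condition is their
# linear combination with coefficient set ck (values assumed for illustration).
c = np.array([0.5, 1.2, -0.3])
Ik = c[0] * I1 + c[1] * I2 + c[2] * I3

# Recover ck by least squares, stacking each image's pixels as a column.
A = np.stack([I1.ravel(), I2.ravel(), I3.ravel()], axis=1)
c_est, *_ = np.linalg.lstsq(A, Ik.ravel(), rcond=None)
print(np.allclose(c_est, c))  # True
```

With noise-free diffuse images the recovery is exact; the difficulty addressed by the third conventional example is that real pixels violating the model (specular or shadowed) corrupt such a naive fit.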
However, a real image includes shadows and specular reflection, and does not satisfy (Expression 2). In the third conventional example, therefore, a plurality of images different in light source direction are shot, and RANSAC (see Non-patent Document 4) is employed to generate three images which satisfy (Expression 2) and include only diffuse reflection. The images thus generated, including only diffuse reflection, are called "base images." By using such base images in the method by Shashua, a linearized image whose lighting condition corresponds to that of a taken image can be generated. The linearized image is expressed by the following expression.

IkL = ck1I1B + ck2I2B + ck3I3B  (Expression 3)

wherein IkL is the linearized image corresponding to an input image Ik, and I1B, I2B, and I3B are the three base images generated by the above method. The linearized image thus generated is an ideal image in which specular reflection is not caused. Hence, image processing free from the influence of specular reflection and shadows can be realized by using this linearized image.
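A minimal sketch of the RANSAC idea used here follows: candidate coefficient sets are solved from random three-pixel samples, and the candidate fitting the most pixels wins, so pixels corrupted by specular reflection do not bias the estimate. The base images, the specular patch, the iteration count, and the inlier tolerance are all simplified assumptions, not the exact published algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

def ransac_linearization_coeffs(I_bases, I_k, iters=200, tol=1e-6):
    """Robustly estimate the linearization coefficient set ck for image I_k.

    Repeatedly samples three pixels, solves the 3x3 system for a candidate
    ck, and keeps the candidate with the most inliers. tol is tiny here
    because the data are noise-free; real images need a larger tolerance.
    """
    A = np.stack([b.ravel() for b in I_bases], axis=1)  # (num_pixels, 3)
    b = I_k.ravel()
    best_c, best_inliers = None, -1
    for _ in range(iters):
        idx = rng.choice(b.size, size=3, replace=False)
        try:
            c = np.linalg.solve(A[idx], b[idx])
        except np.linalg.LinAlgError:
            continue  # degenerate sample, try again
        inliers = int((np.abs(A @ c - b) < tol).sum())
        if inliers > best_inliers:
            best_c, best_inliers = c, inliers
    return best_c

# Synthetic base images and an input corrupted by a bright "specular" patch.
I1B, I2B, I3B = (rng.random((8, 8)) + 0.1 for _ in range(3))
c_true = np.array([0.7, 0.2, 0.4])
I_k = c_true[0] * I1B + c_true[1] * I2B + c_true[2] * I3B
I_k[:2, :2] += 5.0  # outlier region violating (Expression 2)

c_est = ransac_linearization_coeffs([I1B, I2B, I3B], I_k)
# Linearized image per (Expression 3): the specular patch is replaced by
# the ideal diffuse values.
I_kL = c_est[0] * I1B + c_est[1] * I2B + c_est[2] * I3B
print(np.allclose(c_est, c_true))
```

In the uncorrupted region the linearized image coincides with the input, while inside the patch it restores the diffuse value, which is what enables the region classification described next.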
Further, the third conventional example also mentions region classification based on an optical characteristic which utilizes the linearized image. Diffuse reflection, specular reflection, cast shadow, and attached shadow can be separated in accordance with the following relational expressions, wherein ik(p) is the pixel value of a pixel p in an input image Ik, and ik(p)L is the corresponding pixel value of the linearized image. FIG. 43 illustrates this classification.
Diffuse reflection: if |ik(p) − ik(p)L| ≤ T · ik(p)
Specular reflection: if (ik(p) − ik(p)L > T · ik(p)) and (ik(p)L ≥ 0)
Cast shadow: if (ik(p) − ik(p)L < −T · ik(p)) and (ik(p) < Ts)
Attached shadow: if (ik(p)L < 0) and (ik(p) < Ts)

Patent Document 1: Japanese Patent No. 2961264B
Patent Document 2: Japanese Patent No. 3459981B
Patent Document 3: Japanese Patent Application Laid-Open Publication No. 2003-85531A
Patent Document 4: Japanese Patent Application Laid-Open Publication No. 2004-5509A
Non-patent Document 1: Atsuhiko Banno and Katsushi Ikeuchi, "Removing Specularities of Vehicles from Image Sequences by using Spatio-Temporal Images taken by a Moving Camera," Research Report, Information Processing Society of Japan, CVIM, 2003-CVIM-141, pp. 17-23, 2003
Non-patent Document 2: Yasunori Ishii, Kohtaro Fukui, Yasuhiro Mukaigawa, and Takeshi Shakunaga, "Photometric Linearization Based on Classification of Photometric Factors," Transactions of Information Processing Society of Japan, vol. 44, no. SIG5 (CVIM6), pp. 11-21, 2003
Non-patent Document 3: Shashua A., "Geometry and Photometry in 3D Visual Recognition," Ph.D. thesis, Dept. Brain and Cognitive Science, MIT, 1992
Non-patent Document 4: M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Communications of the ACM, vol. 24, issue 6, pp. 381-395, 1981
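The four-way region classification in the third conventional example described above can be sketched per pixel as follows. The thresholds T and Ts and the sample pixel values are illustrative assumptions:

```python
# Classify one pixel by comparing its input value i_k with the value i_kL
# of the linearized image, following the four relational expressions above.
def classify_pixel(i_k, i_kL, T=0.1, Ts=0.2):
    diff = i_k - i_kL
    if abs(diff) <= T * i_k:
        return "diffuse reflection"        # input matches the linearized image
    if diff > T * i_k and i_kL >= 0:
        return "specular reflection"       # brighter than the diffuse prediction
    if diff < -T * i_k and i_k < Ts:
        return "cast shadow"               # dark, but prediction is positive
    if i_kL < 0 and i_k < Ts:
        return "attached shadow"           # dark, prediction itself negative
    return "unclassified"

print(classify_pixel(0.50, 0.48))   # diffuse reflection
print(classify_pixel(0.90, 0.40))   # specular reflection
print(classify_pixel(0.05, 0.30))   # cast shadow
print(classify_pixel(0.05, -0.10))  # attached shadow
```

In practice this test is applied to every pixel of the input image, yielding the region map that FIG. 43 illustrates.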