The reflectance of textured materials, including plastics, plant leaves, cloth, wood, and human skin, can be described as a linear combination of diffuse and specular reflection components. When a scene fits this description, there are benefits to decomposing its image into the diffuse and specular components to facilitate computer analysis. The presence of highlights causes many algorithms in computer vision to produce erroneous results because most of these algorithms (e.g., stereo, optical flow, segmentation, tracking, and recognition) assume that the reflections from all surfaces are Lambertian (diffuse). In practice, non-Lambertian effects such as specular reflections are either ignored or treated as noisy measurements. By decomposing an image into its diffuse and specular components, powerful Lambertian-based computer vision algorithms can be successfully applied. In addition, the specular reflection component is helpful in determining the surface properties of the imaged object, and some computer algorithms rely solely on this component.
In addition to its uses in image analysis and computer vision, the separation of specular and diffuse components also has applications in photo editing and image-based modeling. In image-based modeling, for example, unless the illumination is carefully controlled when acquiring the images, the acquired texture maps often include undesirable artifacts due to specularities.
Computing a specular/diffuse separation is a difficult problem, and many methods have been proposed for separation of the two components, including optical methods, such as by the use of polarizing filters, and computer-based methods, such as color modeling. While many of the separation techniques have relied on global approaches, a few have proposed evaluation of local interactions. An advantage of the local approach is that it admits highly textured scenes that do not contain piecewise constant diffuse colors. In most local methods, the illuminant color is assumed to be known a priori, which is not a significant restriction because it can often be estimated using established global methods.
Reflectance is described by the bidirectional reflectance distribution function (BRDF). The BRDF is considered to be a five-dimensional function of wavelength and imaging geometry and can be written as $f(\lambda, \theta)$, where $\lambda$ is the wavelength of light and $\theta = (\theta_i, \phi_i, \theta_r, \phi_r)$ parameterizes the directions of the incoming irradiance and outgoing radiance (using spherical coordinates in the local coordinate system of a surface point). A Lambertian, or purely diffuse, surface is one whose BRDF is a constant function of $\theta$.
Reflection component separation in a single image started with the work of Shafer (Color Research and Application, 10(4):210-218 (1985)), who introduced the dichromatic reflection model. The dichromatic model of reflectance is a special case of the BRDF model that was originally developed by Shafer to model dielectrics. The model assumes that the BRDF of the surface can be decomposed into two additive components: the interface (specular) reflectance and the body (diffuse) reflectance. Furthermore, it assumes that each of these two components can be factored into a univariate function of wavelength and a multivariate function that depends on the imaging geometry. That is,
$$f(\lambda, \theta) = g_d(\lambda)\, f_d(\theta) + g_s(\lambda)\, \tilde{f}_s(\theta). \tag{1}$$
The functions $g_d(\lambda)$ and $g_s(\lambda)$ are respectively referred to as the diffuse and specular spectral reflectance and are intrinsic properties of the material. The functions $f_d$ (constant for Lambertian surfaces) and $\tilde{f}_s$ are the diffuse and specular BRDFs, respectively.
Specular reflection is due to light that is reflected at the surface without penetration, whereas diffuse reflection is due to light that penetrates the surface and is scattered (possibly multiple times) before being re-emitted. Since the index of refraction of most surface boundaries is uniform over the visible spectrum, the function $g_s(\lambda)$ is typically a constant function, and as a result, the specular color is often independent of the spectral reflectance of the surface and depends only on the spectral power distribution of the illumination and the spectral sensitivities of the sensors. Unlike the specular reflection, the diffuse reflection for these materials is highly color (or wavelength) dependent, as the light that penetrates the surface is absorbed and scattered in a manner that may favor certain wavelengths (colors) and attenuate others. For these materials, the color of the diffuse reflectance thus depends on the spectral reflectance of the surface in addition to the spectral power distribution of the illumination and the spectral sensitivities of the sensors. (See, e.g., M. F. Cohen and J. R. Wallace, Radiosity and Realistic Image Synthesis, Morgan Kaufmann, 1993.)
When this is the case, and $g_s(\lambda)$ is a constant function, Equation 1 reduces to the common expression for the BRDF of a dichromatic surface,
$$f(\lambda, \theta) = g_d(\lambda)\, f_d(\theta) + f_s(\theta), \tag{2}$$
where $f_s(\theta) = g_s(\lambda)\, \tilde{f}_s(\theta)$.
Taking into account the spectral power distribution (SPD) of a light source $L(\lambda)$ and a camera sensitivity function $C_k(\lambda)$, the image formation equation for a surface element with surface normal $\hat{n}$, illuminated by a light source with direction $\hat{l}$, is written
$$I_k = \left(D_k f_d(\theta) + S_k f_s(\theta)\right) \hat{n} \cdot \hat{l}, \tag{3}$$
where $D_k = \int C_k(\lambda) L(\lambda) g_d(\lambda)\, d\lambda$ and $S_k = \int C_k(\lambda) L(\lambda)\, d\lambda$. Under more general illumination distributions (as opposed to a point source) this is written
$$I_k = \sigma_d D_k + \sigma_s S_k,$$
with $\sigma_d$ and $\sigma_s$ being referred to as the diffuse and specular geometric scale factors. Since it often encodes a simple relation between the lighting direction and the surface normal, the term $\sigma_d$ is said to encode “diffuse shading information”.
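The image formation model of Equation (3) can be illustrated numerically. The following is a minimal sketch, not a calibrated camera model: the Gaussian sensitivity curves, the flat illuminant, the reddish reflectance, and the helper names `channel_integrals` and `render_pixel` are all assumptions made for illustration, and the integrals are approximated by a simple Riemann sum on a uniform wavelength grid.

```python
import numpy as np

def channel_integrals(wavelengths, sensitivities, spd, diffuse_reflectance):
    """Approximate D_k and S_k by a Riemann sum on a uniform wavelength grid."""
    d_lambda = wavelengths[1] - wavelengths[0]
    # S_k = integral of C_k(lambda) L(lambda) d lambda
    S = np.sum(sensitivities * spd, axis=1) * d_lambda
    # D_k = integral of C_k(lambda) L(lambda) g_d(lambda) d lambda
    D = np.sum(sensitivities * spd * diffuse_reflectance, axis=1) * d_lambda
    return D, S

def render_pixel(D, S, f_d, f_s, normal, light):
    """Equation (3): I_k = (D_k f_d + S_k f_s) (n . l) for a point source."""
    n = normal / np.linalg.norm(normal)
    l = light / np.linalg.norm(light)
    return (D * f_d + S * f_s) * float(n @ l)

# Hypothetical spectra (assumptions, not measured data).
wavelengths = np.linspace(400.0, 700.0, 301)           # nm
centers = np.array([600.0, 550.0, 450.0])              # R, G, B peak wavelengths
sensitivities = np.exp(-0.5 * ((wavelengths[None, :] - centers[:, None]) / 30.0) ** 2)
spd = np.ones_like(wavelengths)                        # flat "white" illuminant
diffuse_reflectance = np.exp(-0.5 * ((wavelengths - 620.0) / 50.0) ** 2)  # reddish surface

D, S = channel_integrals(wavelengths, sensitivities, spd, diffuse_reflectance)
normal = np.array([0.0, 0.0, 1.0])
light = np.array([0.0, 1.0, 1.0])
I = render_pixel(D, S, f_d=0.8, f_s=0.3, normal=normal, light=light)
print("D =", D, "S =", S, "I =", I)
```

Because the model is additive, rendering the same pixel with only the diffuse term and with only the specular term and summing the two results reproduces `I`.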
In these equations, $S_k$ represents the effective source strength as measured by the kth sensor channel and is independent of the surface being observed. Similarly, $D_k$ is the effective albedo in the kth channel. A multi-channel imaging device such as a color digital or video camera typically has three channels, while a hyperspectral imaging device may have many more.
An RGB color vector $I = [I_1, I_2, I_3]^T = [I_R, I_G, I_B]^T$ from a typical color camera consists of three such measurements, each with a different sensitivity function with support in the visible spectrum. For notational simplicity, let $S = [S_1, S_2, S_3]^T = [S_R, S_G, S_B]^T$ (with a corresponding definition for $D$); since scale can be absorbed by $f_d$ and $f_s$, the assumption $\|D\| = \|S\| = 1$ is made. The vectors $S$ and $D$ are called the specular color and diffuse color, respectively. In RGB color space, a collection of color vectors from a dichromatic material under multiple view and illumination configurations (i.e., different values of $\theta$) lies in a plane: the plane spanned by the effective source and body colors, $S$ and $D$. This plane is referred to as the “dichromatic plane.”
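The dichromatic plane can be checked numerically. The sketch below assumes hypothetical values for the diffuse and specular colors and draws random geometric scale factors. It then confirms that every resulting color vector is a combination of only two directions, so the set has rank 2 and zero component along the plane normal S × D.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unit-norm diffuse and specular colors (assumed values).
D = np.array([0.8, 0.5, 0.2])
D /= np.linalg.norm(D)
S = np.array([1.0, 1.0, 1.0])
S /= np.linalg.norm(S)

# Random diffuse and specular geometric scale factors, one pair per observation.
sigma_d = rng.uniform(0.0, 1.0, size=200)
sigma_s = rng.uniform(0.0, 1.0, size=200)

# I = sigma_d * D + sigma_s * S for each observation; shape (200, 3).
colors = sigma_d[:, None] * D + sigma_s[:, None] * S

# The dichromatic plane passes through the origin with normal S x D.
plane_normal = np.cross(S, D)
plane_normal /= np.linalg.norm(plane_normal)

max_offset = np.max(np.abs(colors @ plane_normal))
rank = np.linalg.matrix_rank(colors)
print("rank:", rank, "max distance from plane:", max_offset)
```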
It has been observed that color vectors in the dichromatic plane often cluster into the shape of a ‘skewed-T’, with the two limbs of the skewed-T formed by linear clusters of diffuse and specular pixels. When these limbs are sufficiently distinct, the diffuse and source colors can be recovered, the two components can be separated, and the highlights can be removed. In scenes consisting of a single surface material, this approach can be used to estimate a single “global” diffuse color, and in principle, this approach can be extended to cases in which an image is segmented into several regions of homogeneous diffuse color.
While this method works well for homogeneous, dichromatic surfaces in the noiseless case, there are three significant limitations that make it difficult to use in practice. First, many surfaces are textured and violate the homogeneous assumption. Even when an image does contain homogeneous surfaces, a non-trivial segmentation process is required to identify them. Second, in order for the specular and diffuse limbs of the skewed-T to be distinct, the specular lobe must be sufficiently narrow (i.e., its angular support must be small relative to the curvature of the surface). Finally, when the diffuse and specular colors are the same, there is no way to distinguish between the two components, and no color separation is possible. Some of these restrictions can be relieved, but not eliminated, by using additional cues such as polarization to estimate the source color at each point (Nayar et al., Intl. J. Computer Vision, 21(3):163-186 (1997)).
Related to these methods are a small number of color-based transformation methods that exploit knowledge of the illuminant to provide a partial dichromatic separation as opposed to a complete and explicit separation. Tan and Ikeuchi obtain a one-channel diffuse image through the transformation in Equation 4:
$$I_d = \frac{3 \max_k \left(I_k / S_k\right) - \sum_k \left(I_k / S_k\right)}{3\tilde{\lambda} - 1}, \tag{4}$$
where $k \in \{1, 2, 3\}$, and the bounded quantity $1/3 < \tilde{\lambda} < 1$ is chosen arbitrarily. This transformation yields a positive monochromatic diffuse image, which can be seen by expanding Equation (4) using Equation (3), $I_k = (D_k f_d + S_k f_s(\theta))\, \hat{n} \cdot \hat{l}$, and assuming that $I_1/S_1 > I_2/S_2,\ I_3/S_3$. In this case,
$$I_d = \frac{2 I_1/S_1 - I_2/S_2 - I_3/S_3}{3\tilde{\lambda} - 1} = \frac{\left(2 D_1/S_1 - D_2/S_2 - D_3/S_3\right) f_d\, \hat{n} \cdot \hat{l}}{3\tilde{\lambda} - 1}. \tag{5}$$
Since the expression is independent of $f_s$ and is directly related to $\hat{n} \cdot \hat{l}$, the positive image $I_d$ is specular-free and depends directly on diffuse shading information.
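The cancellation of the specular term in Equation (5) can be verified with a short numerical sketch. The function name `specular_free_image`, the default value of the free parameter, and the sample colors below are assumptions chosen for illustration. The function applies Equation (4) along the last axis, so it accepts a single pixel or an image of shape (..., 3).

```python
import numpy as np

def specular_free_image(image, S, lam=0.5):
    """Equation (4): I_d = (3 max_k(I_k/S_k) - sum_k(I_k/S_k)) / (3 lam - 1)."""
    if not (1.0 / 3.0 < lam < 1.0):
        raise ValueError("lam must satisfy 1/3 < lam < 1")
    # Normalize each channel by the known illuminant color.
    normalized = np.asarray(image, dtype=float) / np.asarray(S, dtype=float)
    return (3.0 * normalized.max(axis=-1) - normalized.sum(axis=-1)) / (3.0 * lam - 1.0)

# Hypothetical diffuse color, illuminant color, and shading (assumed values).
D = np.array([0.8, 0.5, 0.2])
S = np.array([0.6, 0.6, 0.5])
f_d, shading, lam = 0.7, 0.9, 0.5

def pixel(f_s):
    """Equation (3) with n . l = shading."""
    return (D * f_d + S * f_s) * shading

matte = specular_free_image(pixel(0.0), S, lam)   # no highlight
shiny = specular_free_image(pixel(0.5), S, lam)   # strong highlight

# Closed form of Equation (5); channel 1 (index 0) has the largest I_k / S_k here.
closed_form = (2 * D[0] / S[0] - D[1] / S[1] - D[2] / S[2]) * f_d * shading / (3 * lam - 1)
print(matte, shiny, closed_form)
```

Both pixels map to the same value, which matches the closed form: the specular contribution adds the same amount to every normalized channel and therefore drops out of the difference between three times the maximum and the sum.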
An alternate transformation is proposed by Park (Intelligent Robots and Computer Vision XXI: Algorithms, Techniques, and Active Vision (Proc. SPIE, Vol. 5267) pp. 163-174 (2003)). Park's transformation isolates two predominantly diffuse channels while retaining a similarity to HSI color space. The transformation is composed of a linear transformation $L_p$ and rotation $R_p$, and is written:
$$I_p = R_p L_p I, \quad \text{with } R_p L_p S = [0\ \ 0\ \ 2]^T. \tag{6}$$
The matrices $R_p$ and $L_p$ are chosen such that the third color axis is aligned with the illumination color. As a result, that channel contains the majority of the specular component, leaving the other two channels to be predominantly diffuse.
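The text above does not give the entries of the two matrices, so the following sketch only constructs one hypothetical pair that satisfies the constraint in Equation (6); it should not be read as Park's actual matrices. The assumed choice scales each channel by the illuminant color and then rotates the gray axis onto the third axis. The function name `park_like_transform` and the sample colors are likewise assumptions.

```python
import numpy as np

def park_like_transform(S):
    """Return one hypothetical (R_p, L_p) pair with R_p @ L_p @ S = [0, 0, 2]."""
    S = np.asarray(S, dtype=float)
    # Assumed linear part: divide by the illuminant color so that S maps to a
    # multiple of [1, 1, 1], scaled so its length is 2.
    L_p = (2.0 / np.sqrt(3.0)) * np.diag(1.0 / S)
    # Assumed rotation: orthonormal rows whose third row is the gray axis.
    R_p = np.array([
        [1.0, -1.0, 0.0],
        [1.0, 1.0, -2.0],
        [1.0, 1.0, 1.0],
    ])
    R_p /= np.linalg.norm(R_p, axis=1, keepdims=True)
    return R_p, L_p

# Hypothetical illuminant and diffuse colors (assumed values).
S = np.array([0.6, 0.6, 0.5])
D = np.array([0.8, 0.5, 0.2])

R_p, L_p = park_like_transform(S)
M = R_p @ L_p

matte = M @ (0.7 * D)             # diffuse-only pixel
shiny = M @ (0.7 * D + 0.4 * S)   # same pixel with a specular contribution

print("M @ S =", M @ S)
print("matte =", matte)
print("shiny =", shiny)
```

Under the exact dichromatic model, the specular contribution appears only in the third channel of the transformed pixel, and the first two channels are unchanged by the highlight.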