Shadows pose a challenging problem in many computer vision applications. Shadows in images correspond to areas in a background of a scene that are blocked from a light source. Shadows distort the shape and color of an object, making it difficult to detect and track the object. Two types of shadows are defined. Cast shadows are behind an object, with respect to a light source, while self shadows are due to occlusions of the object itself.
Therefore, shadows cast by the object should be removed while self shadows, which are parts of the object itself that are not illuminated, should be retained so that a complete object silhouette can be obtained.
There are a number of cues that indicate the presence of a shadow in an image. For example, pixel luminance within shadow regions decreases when compared to a reference background. Shadow regions retain the texture of the underlying surface under general viewing conditions. Thus, the intensity reduction rate changes smoothly between neighboring pixels.
Furthermore, most shadow regions do not have strong edges, H. T. Chen, H. H. Lin, T. L. Liu, “Multi-object tracking using dynamical graph matching,” CVPR, 2001. Spatially, moving cast shadow regions should adjoin the object.
Most prior art shadow removal methods are based on an assumption that the shadow pixels have the same chrominance as the background but with a decreased luminance. One method classifies a pixel into one of four categories depending on the distortion of the luminance and the amount of the chrominance of the difference, T. Horprasert, D. Harwood, and L. Davis, “A statistical approach for real-time robust background subtraction and shadow detection,” Proc. of IEEE ICCV Frame-rate Workshop, 1999. A similar method verifies the above criteria by integrating a color model based on Phong shading, J. Stauder, R. Mech, and J. Ostermann, “Detection of moving cast shadows for object segmentation,” IEEE Transactions on Multimedia, vol. 1, 1999. Another method classifies pixels according to a statistical model, I. Mikic, P. Cosman, G. Kogut, and M. Trivedi, “Moving shadow and object detection in traffic scenes,” ICPR, vol. 1, pp. 321-324, 2000.
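The four-category classification described above can be illustrated with a simplified sketch. It assumes a known expected background color per pixel and omits the per-channel noise normalization of the cited method. The function name, the category labels, and all threshold values are hypothetical choices made for illustration only.

```python
import numpy as np

def classify_pixel(pixel, background, tau_cd=15.0,
                   alpha_min=0.4, alpha_shadow=0.9, alpha_highlight=1.1):
    """Label one RGB pixel against its expected background color.

    All thresholds are illustrative assumptions, not values from the source.
    """
    pixel = np.asarray(pixel, dtype=float)
    background = np.asarray(background, dtype=float)

    # Luminance (brightness) distortion: the scale factor alpha that brings
    # the background color closest to the observed pixel in the least-squares sense.
    alpha = pixel.dot(background) / max(background.dot(background), 1e-12)

    # Chrominance distortion: distance from the pixel to the scaled background.
    cd = np.linalg.norm(pixel - alpha * background)

    if cd > tau_cd or alpha < alpha_min:
        return "foreground"   # color differs, or far too dark to be a shadow
    if alpha < alpha_shadow:
        return "shadow"       # same chrominance, decreased luminance
    if alpha <= alpha_highlight:
        return "background"   # essentially unchanged
    return "highlight"        # same chrominance, increased luminance
```

A pixel whose color is a darker multiple of the background color falls in the shadow category, which is the assumption shared by most of the methods described above.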
Color change under changing illumination is described by a von Kries rule. Each color channel is approximately multiplied by a single overall multiplicative factor.
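As a minimal sketch of this multiplicative model, the gains can be estimated from pairs of colors of the same surfaces observed under two illuminations. The sketch assumes the common diagonal reading of the rule, in which each channel has its own gain; a single overall factor is the special case where the three gains are equal. The function name is hypothetical.

```python
import numpy as np

def von_kries_gains(colors_a, colors_b):
    """Least-squares per-channel gains g such that colors_b ~= g * colors_a.

    colors_a, colors_b: arrays of shape (N, 3) holding RGB values of the same
    surfaces under two illuminations (an assumption of this sketch).
    """
    a = np.asarray(colors_a, dtype=float).reshape(-1, 3)
    b = np.asarray(colors_b, dtype=float).reshape(-1, 3)
    # Independent one-parameter least-squares fit for each channel.
    return (a * b).sum(axis=0) / np.maximum((a * a).sum(axis=0), 1e-12)
```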
Some methods remap the color space because the hue of a shadow cast on a background does not change significantly, R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, “Detecting objects, shadows and ghosts in video streams by exploiting color and motion information,” Proc. of IEEE CIAP, 2001.
Another method recovers an invariant image from a three-band color image, G. D. Finlayson, M. Drew, and C. Lu, “Intrinsic images by entropy minimization,” ECCV, 2004. That method finds an intrinsic reflectivity image based on assumptions of Lambertian reflectance, approximately Planckian lighting, and fairly narrowband camera sensors.
A similar method also makes use of an illumination invariant image, H. Jiang, M. Drew, “Shadow resistant tracking in video,” ICME, 2003. If lighting is approximately Planckian, then as the illumination color changes, a log-log plot of (R/G) and (B/G) values for any single surface forms a straight line. Thus, lighting change reduces to a linear transformation. Approximately invariant illumination spaces are first used to transform the color space. This color space is approximately invariant to shading and intensity changes, albeit only for matte surfaces under equi-energy white-illumination, T. Gevers and A. W. Smeulders, “Color-based object recognition,” Patt. Rec., vol. 32, pp. 453-464, 1999.
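The log-log coordinates mentioned above can be computed as follows. This is only a sketch of the coordinate transform, not of the cited tracking method; the function name and the small constant `eps`, added to avoid taking the logarithm of zero, are assumptions.

```python
import numpy as np

def log_chromaticity(rgb, eps=1e-6):
    """Map RGB values of shape (..., 3) to (log(R/G), log(B/G)) coordinates."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    log_rg = np.log((r + eps) / (g + eps))
    log_bg = np.log((b + eps) / (g + eps))
    return np.stack([log_rg, log_bg], axis=-1)
```

Because both coordinates are ratios of channels, scaling all three channels by the same factor leaves them essentially unchanged, which is why such a space is approximately invariant to intensity changes.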
Other methods perform image segmentation. For example, a potential shadow region can be segmented into sub-regions, O. Javed and M. Shah, “Tracking and object classification for automated surveillance,” ECCV, 2002. For each candidate shadow segment and its respective background, gradients are correlated. If the correlation is greater than a threshold, then the candidate segment is considered a cast shadow, and the cast shadow is removed from the foreground region.
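The gradient correlation test described above might be sketched as follows for grayscale images. The use of a normalized correlation coefficient, the function names, and the threshold of 0.8 are assumptions; the cited work may define the correlation and the threshold differently.

```python
import numpy as np

def gradient_correlation(frame, background, mask):
    """Normalized correlation between frame and background gradients
    over the pixels selected by the boolean mask of one candidate segment."""
    mask = np.asarray(mask, dtype=bool)
    gy_f, gx_f = np.gradient(np.asarray(frame, dtype=float))
    gy_b, gx_b = np.gradient(np.asarray(background, dtype=float))

    # Stack horizontal and vertical gradients of the segment into one vector.
    f = np.concatenate([gx_f[mask], gy_f[mask]])
    b = np.concatenate([gx_b[mask], gy_b[mask]])
    f = f - f.mean()
    b = b - b.mean()

    denom = np.linalg.norm(f) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # flat region: no texture evidence either way
    return float(f.dot(b) / denom)

def is_cast_shadow(frame, background, mask, threshold=0.8):
    """Treat the segment as a cast shadow if the gradients correlate strongly."""
    return gradient_correlation(frame, background, mask) > threshold
```

A shadow darkens the background but preserves its texture, so the gradients remain strongly correlated, whereas an object replaces the background texture with its own.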
One drawback is that not all images include statistically significant amounts of object surfaces corresponding to both directly lit and shadowed pixels. Furthermore, the lighting color of the umbra region is not always the same as that of direct sunlight.
One method removes shadows using a measure of brightness, I. Sato and K. Ikeuchi, “Illumination distribution from brightness in shadows,” ICCV (2), pp. 875-882, 1999. The image is segmented into several regions that have the same density. Shadow regions are determined based on the brightness and the color. That method can be extended by applying maximum and minimum value filters followed by a smoothing operator to obtain a global brightness of the image. From the global brightness, the shadow density can be determined, M. Baba and N. Asada, “Shadow removal from a real picture,” SIGGRAPH Conference on Sketches and Applications, 2003.
Another method segments the image in two stages, E. Salvador, A. Cavallaro, and T. Ebrahimi, “Shadow identification and classification using invariant color models,” ICASSP, 2001. The first stage extracts moving cast shadows in each frame of a sequence of frames. The second stage tracks the extracted shadows in subsequent frames. Obviously, the segmentation-based approach is inherently degraded by inaccuracies of the segmentation.
A geometrical method assumes that the shadow is in the form of an ellipsoid. Any foreground pixel that lies in the shadow ellipse and has an intensity lower than that of the corresponding pixel in the background, according to a threshold, is classified as a shadow pixel, T. Zhao, R. Nevatia, “Tracking multiple humans in complex situations,” PAMI, vol. 26, no. 9, 2004.
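The geometrical test can be sketched for grayscale images as follows. The sketch assumes that the shadow ellipse is already known and axis-aligned in the image; how the ellipse is obtained is outside its scope. The function name, the parameter names, and the threshold value are hypothetical.

```python
import numpy as np

def shadow_pixels_in_ellipse(frame, background, foreground,
                             center, axes, threshold=10.0):
    """Return a boolean mask of foreground pixels classified as shadow.

    center: (row, col) of the assumed shadow ellipse.
    axes:   (semi-axis along rows, semi-axis along columns), in pixels.
    """
    frame = np.asarray(frame, dtype=float)
    background = np.asarray(background, dtype=float)
    rows, cols = np.indices(frame.shape)

    cy, cx = center
    ay, ax = axes
    inside = ((rows - cy) / ay) ** 2 + ((cols - cx) / ax) ** 2 <= 1.0

    # The pixel must be darker than the background by more than the threshold.
    darker = (background - frame) > threshold
    return np.asarray(foreground, dtype=bool) & inside & darker
```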
Unfortunately, the assumptions of the above methods are difficult to justify in general. Detection based on luminance criteria fails when pixels of foreground objects are darker than the background and have a uniform gain with respect to the reference background. Color space transformations are deficient when the background color is gray. Geometrical shadow models depend heavily on the viewpoint and the object shape. It is not possible to remove shadows correctly for a wide range of conditions with several predefined parameters. Another limitation of the above methods is that they do not adapt to different types of shadow, e.g., light shadows due to a weak ambient light source, or heavy shadows due to strong spotlights.
One observation is that cast shadows constitute a ‘prevalent’ change in color. In other words, a color change at a pixel due to an object has a higher variance than a color change due to cast shadows, because the object can have different colors. For a pixel, cast shadows cause identical background color change. However, color changes caused by object motion are different in the case where the object colors are different, which is the usual case.
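This observation can be illustrated numerically for a single pixel. The sketch below measures the color change as the per-channel ratio of each observed color to the background color and sums the variance of that ratio over the channels. Both the ratio-based definition of the change and the function name are assumptions made for illustration.

```python
import numpy as np

def color_change_variance(observations, background):
    """Total variance of the per-channel color change at one pixel.

    observations: array of shape (N, 3), RGB values seen at the pixel
                  while it differed from the background.
    background:   RGB background color of the same pixel.
    """
    obs = np.asarray(observations, dtype=float).reshape(-1, 3)
    bg = np.maximum(np.asarray(background, dtype=float), 1e-12)
    change = obs / bg                      # multiplicative color change
    return float(change.var(axis=0).sum())

background = np.array([100.0, 120.0, 140.0])
# Repeated cast shadows darken the pixel by nearly the same factor.
shadow_obs = np.array([0.60, 0.62, 0.58])[:, None] * background
# Passing objects show unrelated colors at the same pixel.
object_obs = np.array([[200.0, 30.0, 30.0],
                       [20.0, 200.0, 90.0],
                       [250.0, 250.0, 250.0]])

shadow_var = color_change_variance(shadow_obs, background)
object_var = color_change_variance(object_obs, background)
```

With these made-up observations, `shadow_var` is much smaller than `object_var`, which matches the idea that shadow-induced changes cluster tightly while object-induced changes spread out.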