The invention relates generally to the field of digital image processing and digital image understanding, and more particularly to a system for detecting which regions in photographic and other similar images are of the sky, and still more particularly to a sky detection system based on color classification, region extraction, and physics-motivated sky signature validation.
Sky is among the most important subject matters frequently seen in photographic images. Detection of sky can often facilitate a variety of image understanding, enhancement, and manipulation tasks. Sky is a strong indicator of an outdoor image for scene categorization (e.g., outdoor scenes vs. indoor scenes, picnic scenes vs. meeting scenes, city vs. landscape, etc.). See, for example, M. Szummer and R. W. Picard, “Indoor-Outdoor Image Classification,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998 and A. Vailaya, A. Jain, and H. J. Zhang, “On Image Classification: City vs. Landscape,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998 (both of which are incorporated herein by reference). With information about the sky, it is possible to formulate queries such as “outdoor images that contain significant sky” or “sunset images” etc. (e.g., see J. R. Smith and C.-S. Li, “Decoding Image Semantics Using Composite Region Templates,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998, incorporated herein by reference). Thus, sky detection can also lead to more effective content-based image retrieval.
For recognizing the orientation of an image, knowledge of sky and its orientation may indicate the image orientation for outdoor images (contrary to the common belief, a sky region is not always at the top of an image). Further, in detecting main subjects in the image, sky regions can usually be excluded because they are likely to be part of the background.
The most prominent characteristic of sky is its color, which is usually light blue when the sky is clear. Such a characteristic has been used to detect sky in images. For example, U.S. Pat. No. 5,889,578, entitled “Method and Apparatus for Using Film Scanning Information to Determine the Type and Category of an Image” by F. S. Jamzadeh (which is incorporated herein by reference), mentions the use of a color cue (“light blue”) to detect sky without providing further description.
U.S. Pat. No. 5,642,443, entitled “Whole Order Orientation Method and Apparatus” by Robert M. Goodwin (which is incorporated herein by reference), uses color and (lack of) texture to indicate pixels associated with sky in the image. In particular, Goodwin partitions the chromaticity domain into sectors. Pixels within sampling zones along the two long sides of a non-oriented image are examined. If an asymmetric distribution of sky colors is found, the orientation of the image is estimated. The orientation of a whole order of photos is determined based on estimates for individual images in the order. For the whole order orientation method in Goodwin to be successful, a sufficiently large group of characteristics (so that one with at least an 80% success rate is found in nearly every image), or a smaller group of characteristics (with greater than a 90% success rate, which characteristics can be found in about 40% of all images), is needed. Therefore, with Goodwin, a very robust sky detection method is not required.
In a work by Saber et al. (E. Saber, A. M. Tekalp, R. Eschbach, and K. Knox, “Automatic Image Annotation Using Adaptive Color Classification”, CVGIP: Graphical Models and Image Processing, vol. 58, pp. 115-126, 1996, incorporated herein by reference), color classification was used to detect sky. The sky pixels are assumed to follow a 2D Gaussian probability density function (PDF). Therefore, a metric similar to the Mahalanobis distance is used, along with an adaptively determined threshold for a given image, to determine sky pixels. Finally, information regarding the presence of sky, grass, and skin, which is extracted from the image based solely on the above-mentioned color classification, is used to determine the categorization and annotation of an image (e.g., “outdoor”, “people”).
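The kind of color classification described for Saber et al. can be illustrated with a minimal sketch. It assumes each pixel has already been reduced to a 2D chromaticity pair and that the mean and covariance of the sky color Gaussian are known. The function names, the sample values, and the fixed threshold are hypothetical; Saber et al. determine the threshold adaptively per image, which is not reproduced here.

```python
def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2D point from a 2D Gaussian."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def classify_pixels(chroma, mean, cov, threshold):
    """Flag each chromaticity pair whose squared distance is within threshold.

    The threshold is a fixed, hypothetical value in this sketch.
    """
    return [mahalanobis_sq(p, mean, cov) <= threshold for p in chroma]


# Hypothetical sky color model and three sample pixels.
sky_mean = (0.0, 0.0)
sky_cov = ((4.0, 0.0), (0.0, 1.0))
flags = classify_pixels([(2.0, 0.0), (0.0, 3.0), (1.0, 0.5)],
                        sky_mean, sky_cov, threshold=2.0)
```

A classifier of this kind responds only to color, so a blue wall or a body of water with the same chromaticity is flagged just as readily as sky.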
Recognizing that matching natural images solely based on global similarities can only take things so far, Smith, supra, developed a method for decoding image semantics using composite region templates (CRT) in the context of content-based image retrieval. With the process in Smith, after an image is partitioned using color region segmentation, vertical and horizontal scans are performed on a typical 5×5 grid to create the CRT, which is essentially a 5×5 matrix showing the spatial relationship among regions. Assuming known image orientation, a blue extended patch at the top of an image is likely to represent clear sky, and the regions corresponding to skies and clouds are likely to be above the regions corresponding to grass and trees. Although these assumptions are not always valid, it was nevertheless shown in Smith, supra, that queries performed using CRTs, color histograms and texture were much more effective for such categories as “sunsets” and “nature”.
The major drawback of conventional techniques is that they cannot differentiate other similarly colored or textured subject matters, such as a blue wall, a body of water, a blue shirt, and so on. Furthermore, some of these techniques have to rely on the knowledge of the image orientation. Failure to reliably detect the presence of sky, in particular false positive detection, may lead to failures in the downstream applications.
The invention provides a robust sky detection system which is based on color hue classification, texture analysis, and physics-motivated sky trace analysis. The invention utilizes hue color information to select bright, sky-colored pixels and utilizes connected component analysis to find potential sky regions. The invention also utilizes gradient to confirm that sky regions are low in texture content and segments open space, defined as smooth expanses, to break up adjacent regions with similar sky color beliefs but dissimilar sky colors. The invention also utilizes gradient to determine the zenith-horizon direction and uses a physics-motivated sky trace signature to determine if a candidate region fits a sky model.
More specifically, the invention can take the form of a method, image recognition system, computer program, etc., for detecting sky regions in an image and comprises classifying potential sky pixels in the image by color, extracting connected components of the potential sky pixels, eliminating ones of the connected components that have a texture above a predetermined texture threshold, computing desaturation gradients of the connected components, and comparing the desaturation gradients of the connected components with a predetermined desaturation gradient for sky to identify true sky regions in the image.
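The region extraction and texture elimination steps can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the mask of potential sky pixels is taken as given, regions are grown with 4-connectivity, and texture is measured as the mean absolute forward difference of a luminance image. The function names, the texture measure, and the threshold value are all hypothetical.

```python
from collections import deque


def connected_components(mask):
    """Group truthy mask cells into 4-connected components of (row, col) pairs."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components


def mean_texture(component, luminance):
    """Mean absolute forward difference over the component (assumed measure)."""
    rows, cols = len(luminance), len(luminance[0])
    total = 0.0
    for y, x in component:
        # Differences are taken against image neighbors, clamped at the border.
        total += abs(luminance[y][min(x + 1, cols - 1)] - luminance[y][x])
        total += abs(luminance[min(y + 1, rows - 1)][x] - luminance[y][x])
    return total / len(component)


def low_texture_components(mask, luminance, texture_threshold):
    """Keep only components whose texture does not exceed the threshold."""
    return [comp for comp in connected_components(mask)
            if mean_texture(comp, luminance) <= texture_threshold]


mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 1]]
luminance = [[10, 10, 10, 50],
             [10, 10, 10, 90],
             [10, 10, 10, 10]]
candidates = low_texture_components(mask, luminance, texture_threshold=5)
```

In this toy input the smooth 2×2 block survives and the two-pixel region with a large luminance step is eliminated. The surviving components would then go on to the desaturation gradient comparison.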
The desaturation gradients comprise desaturation gradients for red, green and blue trace components of the image and the predetermined desaturation gradient for sky comprises, from horizon to zenith, a decrease in red and green light trace components and a substantially constant blue light trace component.
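As a rough illustration of this signature, the sketch below fits a least-squares slope to each color channel of a trace sampled from horizon to zenith and checks that red and green decrease while blue stays approximately constant. The helper names and the tolerances `min_drop` and `blue_tolerance` are hypothetical, and the slope fit is only one possible way to summarize a trace.

```python
def slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def matches_sky_signature(trace, min_drop=0.5, blue_tolerance=0.25):
    """Check a horizon-to-zenith trace of (R, G, B) samples.

    min_drop and blue_tolerance are hypothetical per-sample tolerances.
    """
    if len(trace) < 2:
        return False
    red, green, blue = (slope([px[k] for px in trace]) for k in range(3))
    return (red <= -min_drop and green <= -min_drop
            and abs(blue) <= blue_tolerance)


# Hypothetical traces: clear sky desaturates toward the horizon,
# while a uniformly painted blue wall shows no such trend.
sky_trace = [(200, 220, 250), (190, 212, 250), (180, 204, 251), (170, 196, 250)]
wall_trace = [(100, 120, 240)] * 4
```

With these inputs, `matches_sky_signature(sky_trace)` holds and `matches_sky_signature(wall_trace)` does not, even though both traces are blue.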
The color classifying includes forming a belief map of pixels in the image using a pixel classifier, computing an adaptive threshold of sky color, and classifying ones of the pixels that exceed the threshold. The computing of the adaptive threshold comprises identifying a first valley in a belief histogram derived from the belief map. The belief map and the belief histogram are unique to the image.
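A minimal sketch of the first-valley threshold follows. It assumes belief values in the range 0 to 1, a fixed number of histogram bins, a scan from low belief toward high belief, and a threshold placed at the center of the valley bin. None of these choices is specified in the text above, and the function names are hypothetical.

```python
def belief_histogram(beliefs, bins=10):
    """Count belief values (assumed to lie in [0, 1]) into equal-width bins."""
    hist = [0] * bins
    for b in beliefs:
        hist[min(int(b * bins), bins - 1)] += 1
    return hist


def first_valley(hist):
    """Index of the first interior bin that is lower than its predecessor
    and no higher than its successor, scanning from low belief upward."""
    for i in range(1, len(hist) - 1):
        if hist[i] < hist[i - 1] and hist[i] <= hist[i + 1]:
            return i
    return None


def adaptive_threshold(beliefs, bins=10):
    """Threshold at the center of the first valley bin, or None if no valley."""
    valley = first_valley(belief_histogram(beliefs, bins))
    return None if valley is None else (valley + 0.5) / bins


# Hypothetical belief map, flattened: many low-belief pixels and a
# separate cluster of high-belief pixels.
beliefs = [0.05] * 6 + [0.15] * 3 + [0.25] + [0.85] * 4 + [0.95] * 5
threshold = adaptive_threshold(beliefs)
sky_pixels = [b for b in beliefs if b > threshold]
```

Because the histogram is built from the image's own belief map, the resulting threshold adapts to each image rather than being fixed in advance.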
The invention also determines a horizontal direction of a scene within the image by identifying a first gradient parallel to a width direction of the image, identifying a second gradient perpendicular to the width direction of the image and comparing the first gradient and the second gradient. The horizontal direction of the scene is identified by the smaller of the first gradient and the second gradient.
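The gradient comparison can be sketched as below. The sketch assumes a single image channel given as a list of rows and summarizes each gradient as a mean absolute difference between neighboring pixels. The choice of channel, the summary statistic, and the function names are assumptions made for illustration, and the input must be at least 2×2.

```python
def mean_abs_gradient(channel, axis):
    """Mean absolute neighbor difference along 'width' or 'height'."""
    rows, cols = len(channel), len(channel[0])
    if axis == "width":
        diffs = [abs(row[x + 1] - row[x])
                 for row in channel for x in range(cols - 1)]
    elif axis == "height":
        diffs = [abs(channel[y + 1][x] - channel[y][x])
                 for y in range(rows - 1) for x in range(cols)]
    else:
        raise ValueError("axis must be 'width' or 'height'")
    return sum(diffs) / len(diffs)


def scene_horizontal_direction(channel):
    """Return the image axis with the smaller gradient."""
    along_width = mean_abs_gradient(channel, "width")
    along_height = mean_abs_gradient(channel, "height")
    return "width" if along_width <= along_height else "height"


# Hypothetical sky patch whose brightness changes from row to row,
# and the same patch rotated by 90 degrees.
upright = [[100, 100, 100], [80, 80, 80], [60, 60, 60]]
rotated = [[100, 80, 60], [100, 80, 60], [100, 80, 60]]
```

For the upright patch the smaller gradient lies along the width, so the scene horizon is parallel to the image width. For the rotated patch the result is the height direction.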
One advantage of the invention lies in the utilization of a physical model of the sky based on the scattering of light by small particles in the air. By using a physical model (as opposed to a color or texture model), the invention is not likely to be fooled by other similarly colored subject matters such as bodies of water, walls, toys, and clothing. Further, the inventive region extraction process automatically determines an appropriate threshold for the sky color belief map. By utilizing the physical model in combination with color and texture filters, the invention produces results which are superior to conventional systems.
The invention works very well on 8-bit images from sources including film and digital cameras after pre-balancing and proper dynamic range adjustment. The sky regions detected by the invention show excellent spatial alignment with perceived sky boundaries.