Sky is among the most important subject matters frequently seen in photographic images. In a digital color image, a pixel or region represents sky if it corresponds to a sky region in the original scene. In essence, a pixel or region represents sky if it is an image of the earth's atmosphere. Detection of sky can often facilitate a variety of image understanding, enhancement, and manipulation tasks. Sky is a strong indicator of an outdoor image for scene categorization (e.g., outdoor scenes vs. indoor scenes, picnic scenes vs. meeting scenes, city vs. landscape, etc.). See, for example M. Szummer and R. W. Picard, “Indoor-Outdoor Image Classification,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998 and A. Vailaya, A. Jain, and H. J. Zhang, “On hnage Classification: City vs. Landscape,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998 (both of which are incorporated herein by reference). With information about the sky, it is possible to formulate queries such as “outdoor images that contain significant sky” or “sunset images” etc. (e.g., see J. R. Smith and C.-S. Li, “Decoding Image Semantics Using Composite Region Templates,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998, incorporated herein by reference). Thus, sky detection can also lead to more effective content-based image retrieval.
For recognizing the orientation of an image, knowledge of sky and its orientation may indicate the image orientation for outdoor images (contrary to the common belief, a sky region is not always at the top of an image). Further, in detecting main subjects in the image, sky regions can usually be excluded because they are likely to be part of the background.
The most prominent characteristic of sky is its color, which is usually light blue when the sky is clear. Such a characteristic has been used to detect sky in images. For example, U.S. Pat. No. 5,889,578, entitled “Method and Apparatus for Using Film Scanning Information to Determine the Type and Category of an Image” by F. S. Jamzadeh, mentions the use of color cue (“light blue”) to detect sky without providing further description.
Commonly assigned U.S. Pat. No. 5,642,443, entitled, “Whole Order Orientation Method and Apparatus” by Robert M. Goodwin, (which is incorporated herein by reference) uses color and (lack of) texture to indicate pixels associated with sky in the image. In particular, partitioning by chromaticity domain into sectors is utilized by Goodwin. Pixels with sampling zones along the two long sides of a non-oriented image are examined. If an asymmetric distribution of sky colors is found, the orientation of the image is estimated. The orientation of a whole order of photos is determined based on estimates for individual images in the order. For the whole order orientation method in Goodwin to be successful, a sufficiently large group of characteristics (so that one with at least an 80% success rate is found in nearly every image), or a smaller group of characteristics (with greater than a 90% success rate, which characteristics can be found in about 40% of all images) is needed. Therefore, with Goodwin, a very robust sky detection method is not required.
In a work by Saber et al. (E. Saber, A. M. Tekalp, R. Eschbach, and K. Knox, “Automatic Image Annotation Using Adaptive Color Classification”, CVGIP: Graphical Models and Image Processing, vol. 58, pp. 115-126, 1996, incorporated herein by reference), color classification was used to detect sky. The sky pixels are assumed to follow a 2D Gaussian probability density function (PDF). Therefore, a metric similar to the Mahalonobis distance is used, along with an adaptively determined threshold for a given image, to determine sky pixels. Finally, information regarding the presence of sky, grass, and skin, which are extracted from the image based solely on the above-mentioned color classification, are used to determine the categorization and annotation of an image (e.g., “outdoor”, “people”).
Recognizing that matching natural images solely based on global similarities can only take things so far. Therefore, Smith, supra, developed a method for decoding image semantics using composite regions templates (CRT) in the context of content-based image retrieval. With the process in Smith, after an image is partitioned using color region segmentation, vertical and horizontal scans are performed on a typical 5×5 grid to create the CRT, which is essentially a 5×5 matrix showing the spatial relationship among regions. Assuming known image orientation, a blue extended patch at the top of an image is likely to represent clear sky, and the regions corresponding to skies and clouds are likely to be above the regions corresponding to grass and trees. Although these assumptions are not always valid, nevertheless it was shown in Smith, supra, that queries performed using CRTs, color histograms and texture were much more effective for such categories as “sunsets” and “nature”.
In commonly assigned U.S. Pat. No. 6,504,951, Luo and Etz show that blue sky appears to be desaturated near the horizon, causing a gradual gradient across a sky region. Sky is identified by examining such gradient signal of candidate sky region. The classification of sky is given to regions exhibiting an acceptable gradient signal. While the method described provides excellent performance, especially in eliminating other objects with similar colors to blue sky, the algorithm may fail to detect small regions of sky (e.g. a small region of sky visible between tree branches) because the small region is not large enough to exhibit the proper gradient signal.