Photographs contain a wide variety of subject matters. Examples of primary, most frequently seen subject matters are sky, grass, sand, and snow. Sky is among the most important of these. Detection of sky can often facilitate a variety of image understanding, enhancement, and manipulation tasks. Sky is a strong indicator of an outdoor image for scene categorization (e.g., outdoor scenes vs. indoor scenes, picnic scenes vs. meeting scenes, city vs. landscape, etc.). See, for example, M. Szummer and R. W. Picard, “Indoor-Outdoor Image Classification,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998 and A. Vailaya, A. Jain, and H. J. Zhang, “On Image Classification: City vs. Landscape,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998 (both of which are incorporated herein by reference). With information about the sky, it is possible to formulate queries such as “outdoor images that contain significant sky” or “sunset images” (e.g., see J. R. Smith and C.-S. Li, “Decoding Image Semantics Using Composite Region Templates,” in Proc. IEEE Intl. Workshop on Content-based Access of Image and Video Database, 1998, incorporated herein by reference). Thus, sky detection can also lead to more effective content-based image retrieval.
The most prominent characteristic of sky is its color, which is usually light blue when the sky is clear. In the case of cloudy or overcast skies, there is a larger variation in sky color. However, even for cloudy and overcast skies, the sky regions tend to be the brightest regions in an image. Unlike clear skies, cloudy or overcast skies tend to contain higher level texture information. Sky color has been used to detect clear sky in images. For example, U.S. Pat. No. 5,889,578 issued Mar. 30, 1999 to Jamzadeh (which is incorporated herein by reference), mentions the use of a color cue (“light blue”) to detect sky without providing further description.
U.S. Pat. No. 5,642,443, issued Jun. 24, 1997 to Goodwin (which is incorporated herein by reference), uses color and (lack of) texture to indicate pixels associated with sky in the image. In particular, Goodwin partitions the chromaticity domain into sectors. Pixels within sampling zones along the two long sides of a non-oriented image are examined. If an asymmetric distribution of sky colors is found, the orientation of the image is estimated. The orientation of a whole order of photos is determined based on estimates for individual images in the order. For the whole order orientation method in Goodwin to be successful, either a sufficiently large group of characteristics (so that one with at least an 80% success rate is found in nearly every image) or a smaller group of characteristics (with greater than a 90% success rate, where such characteristics can be found in about 40% of all images) is needed. Therefore, with Goodwin, a very robust sky detection method is not required.
In a work by Saber et al. (E. Saber, A. M. Tekalp, R. Eschbach, and K. Knox, “Automatic Image Annotation Using Adaptive Color Classification”, CVGIP: Graphical Models and Image Processing, vol. 58, pp. 115-126, 1996, incorporated herein by reference), color classification was used to detect sky. The sky pixels are assumed to follow a 2D Gaussian probability density function (PDF). Therefore, a metric similar to the Mahalanobis distance is used, along with an adaptively determined threshold for a given image, to determine sky pixels. Finally, information regarding the presence of sky, grass, and skin, which is extracted from the image based solely on the above-mentioned color classification, is used to determine the categorization and annotation of an image (e.g., “outdoor”, “people”).
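The Gaussian color classification described above can be illustrated with a minimal sketch. It is not the implementation of Saber et al.; the function names, the use of a two-component chromaticity pair per pixel, and the fixed `threshold` argument (standing in for the adaptively determined threshold, whose derivation is not given here) are all assumptions made for illustration.

```python
def fit_gaussian_2d(samples):
    """Estimate the mean and 2x2 covariance of 2D color samples.

    `samples` is a list of (c1, c2) pairs, e.g. chromaticity values of
    pixels known to be sky (a hypothetical training set).
    """
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    s11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    s22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    s12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), ((s11, s12), (s12, s22))


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2D point from a 2D Gaussian."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Uses the closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def classify_pixels(pixels, mean, cov, threshold):
    """Mark each pixel as sky (True) if its squared distance is within threshold.

    `threshold` is a plain parameter here; an adaptive, per-image choice
    would replace it in a fuller treatment.
    """
    return [mahalanobis_sq(p, mean, cov) <= threshold for p in pixels]
```

In this sketch, a pixel whose color lies close to the fitted mean, measured in units of the fitted covariance, is labeled as sky; everything else is rejected. Because only color is considered, any object with sky-like color would also be accepted.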
Recognizing that matching natural images solely based on global similarities can only take things so far, Smith, supra, developed a method for decoding image semantics using composite region templates (CRT) in the context of content-based image retrieval. With the process in Smith, after an image is partitioned using color region segmentation, vertical and horizontal scans are performed on a typical 5×5 grid to create the CRT, which is essentially a 5×5 matrix showing the spatial relationship among regions. Assuming known image orientation, a blue extended patch at the top of an image is likely to represent clear sky, and the regions corresponding to skies and clouds are likely to be above the regions corresponding to grass and trees. Although these assumptions are not always valid, it was nevertheless shown in Smith, supra, that queries performed using CRTs, color histograms and texture were much more effective for such categories as “sunsets” and “nature”.
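The idea of a 5×5 grid of region labels scanned for spatial relationships can be illustrated with a rough sketch. It is not Smith's actual procedure: the majority-vote reduction to the grid, the function names, and the "first appears above" test are assumptions chosen for illustration, and only vertical scans are shown.

```python
from collections import Counter


def grid_labels(label_map, grid=5):
    """Reduce a 2D map of region labels (list of rows) to a grid x grid matrix.

    Each grid cell takes the most common label among the pixels it covers.
    """
    h, w = len(label_map), len(label_map[0])
    out = []
    for gy in range(grid):
        y0 = gy * h // grid
        y1 = max((gy + 1) * h // grid, y0 + 1)
        row = []
        for gx in range(grid):
            x0 = gx * w // grid
            x1 = max((gx + 1) * w // grid, x0 + 1)
            votes = Counter(
                label_map[y][x] for y in range(y0, y1) for x in range(x0, x1)
            )
            row.append(votes.most_common(1)[0][0])
        out.append(row)
    return out


def vertical_scan(grid_map, col):
    """Labels met top to bottom in one column, consecutive repeats collapsed."""
    seq = []
    for row in grid_map:
        if not seq or seq[-1] != row[col]:
            seq.append(row[col])
    return seq


def count_above(grid_map, upper, lower):
    """Count column scans in which `upper` first appears before `lower`."""
    count = 0
    for col in range(len(grid_map[0])):
        seq = vertical_scan(grid_map, col)
        if upper in seq and lower in seq and seq.index(upper) < seq.index(lower):
            count += 1
    return count
```

Under these assumptions, an image whose grid shows a "sky" label above a "grass" label in most column scans would match the expected layout, provided the image orientation is known. If the image were upside down, the same scan would report the opposite relationship.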
The major drawback of conventional techniques for subject matter detection is that they cannot reliably identify primary subject matters, such as cloudy and overcast sky, because they do not consider the unique characteristics of those subject matters. Furthermore, some of these techniques have to rely on a priori knowledge of the image orientation. Failure to reliably detect the presence of primary subject matters, in particular false positive detection, may lead to failures in the downstream applications (e.g., falsely detected sky regions may lead to incorrect inference of image orientation). Therefore, there is a need for a more robust primary subject detection method.