This application is related to US co-pending applications respectively entitled METHOD AND APPARATUS FOR REPRESENTING COLORED SURFACES VIA A COLOR CODE BOOK, now U.S. Pat. No. 6,591,007 B1, and METHOD AND APPARATUS FOR DETECTING REGIONS BELONGING TO A SPECIFIED COLOR SURFACE IN AN UNSEGMENTED IMAGE, now U.S. Pat. No. 6,469,706 B1, both hereby incorporated by reference.
1. Field of Invention
The present invention relates generally to the field of image storage databases. More specifically, the present invention is related to recognizing and retrieving target images from a database of images.
2. Discussion of Prior Art
The representation and discussion of color and color perception usually involves first defining a color space or color model. A color space (or model) comprises a plurality of attributes which are used to specify the variety of colors we perceive. Different colors are distinguished by having different attribute values.
One color model, or color space, is not absolutely better than another. They each have been designed from different perspectives and thus have different qualities, as well as handicaps, which make them each suitable for different applications.
RGB color space, probably the best known color model, is related to the hardware capabilities of a CRT which uses an electron gun to excite red, green, and blue phosphors to produce perceivable color. One difficulty with the RGB model is that it is quite sensitive to the variation in color values brought about by lighting changes. Intensity is distributed throughout all three parameters of this model, thus creating color values which are highly sensitive to scene brightness. In essence, the effect of a change in the illumination intensity of a scene is approximated by a multiplication of each RGB color by a uniform scalar factor.
A simple example of the RGB color space's instability with respect to illumination is the gross change in perceived color which accompanies changing the brightness value on a computer monitor. Other color spaces have been developed which are more stable with respect to intensity variations. The HSV model is one such example.
Physiologically, human eyes tend to respond to and evaluate colors according to hue (H), saturation (S), and luminance (L). Hue refers to color produced by different wavelengths of light. Saturation refers to the measure of a pure hue in a color. As an example, red and pink are of similar hue but have different saturation levels. Brightness, luminance, or intensity (the value, V, component of the HSV model) is the measure of the amount of light reflected or emitted from an object. Because the luminance value is separate from H and S, this model shows more stability than the RGB model with respect to illumination differences. However, representation of color in the HSV model is not totally immune to illumination changes.
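The contrast between the two models can be checked numerically under the uniform-scaling approximation of an illumination change described above. The following is a minimal sketch using Python's standard `colorsys` module. The helper names `scale_rgb` and `compare_under_scaling`, the sample color, and the scale factor are illustrative assumptions, not part of the described system.

```python
import colorsys

def scale_rgb(rgb, k):
    """Model an illumination change as one scalar k applied to every channel.

    Assumes channels in [0, 1] and a k that keeps them in range (no clipping).
    """
    return tuple(c * k for c in rgb)

def compare_under_scaling(rgb, k):
    """Return (rgb_change, hsv_change): per-component absolute differences."""
    scaled = scale_rgb(rgb, k)
    hsv_before = colorsys.rgb_to_hsv(*rgb)
    hsv_after = colorsys.rgb_to_hsv(*scaled)
    rgb_change = tuple(abs(a - b) for a, b in zip(rgb, scaled))
    hsv_change = tuple(abs(a - b) for a, b in zip(hsv_before, hsv_after))
    return rgb_change, hsv_change

pink = (0.8, 0.4, 0.5)  # illustrative color
rgb_change, hsv_change = compare_under_scaling(pink, 0.5)
print("RGB change (R, G, B):", rgb_change)
print("HSV change (H, S, V):", hsv_change)
```

Under this idealized model all three RGB components move, while only V moves in HSV and the H and S differences are zero up to floating-point error. Real illumination changes are not purely multiplicative (clipping, colored light sources, and shadows all violate the assumption), which is consistent with the observation that HSV is not totally immune to illumination changes.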
A number of color spaces have been developed, and will continue to be developed, to address the various applications which rely on color information to accomplish their goals. One useful undertaking, which is well known in the art, is the formulation of transform operations to map one color space into another. Using these transforms, a scene or image created for one application is appropriately mapped to a second color space for an application which benefits from the image being in the second representation.
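As an illustration of such a transform operation, the sketch below maps a small image from RGB to HSV and back using the conversion functions in Python's standard `colorsys` module. The helper `map_image` and the sample pixel values are assumptions introduced only for this example.

```python
import colorsys

def map_image(image, transform):
    """Apply a per-pixel color space transform to a 2-D grid of color triples."""
    return [[transform(*pixel) for pixel in row] for row in image]

# Illustrative 2x2 image with channels in [0, 1].
rgb_image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.2, 0.4, 0.6), (0.5, 0.5, 0.5)],
]

hsv_image = map_image(rgb_image, colorsys.rgb_to_hsv)   # first space -> second space
restored = map_image(hsv_image, colorsys.hsv_to_rgb)    # and back again

for rgb_row, hsv_row in zip(rgb_image, hsv_image):
    for rgb, hsv in zip(rgb_row, hsv_row):
        print(rgb, "->", hsv)
```

Because the transform is applied pixel by pixel, the same helper would work with any other pair of conversion functions that take and return three components.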
One imaging application that is receiving a lot of attention is that of a large collection of images stored in a searchable database. In particular, querying such a collection to locate images which contain desired objects requires detection, recognition and indexing of color surfaces or objects within an image.
In the past, image content detection and recognition have relied on controlled imaging conditions of the objects of interest. Background surface color, illumination source, illumination quality, illumination amount, object orientation, shadowing and occlusion were all carefully controlled to ensure that proper recognition of an object was possible.
However, the ability to detect and recognize color surfaces or objects in images, across a wide range of imaging conditions, is fundamental to the creation of robust and useful content-based image retrieval systems. Image databases typically include images from a plurality of sources under a variety of imaging conditions. Consistent representation of a particular color surface throughout these varying conditions is necessary to enable color queries (i.e. queries for a particular color surface). The present invention introduces a color code book to provide an imaging-condition invariant representation of color surfaces present within an image. The prior art discussed below fails to provide for such a robust, compact and simple method of representing color surfaces.
As explained previously, color represented through three filtered channels (e.g. RGB) of a typical camera-grabbed input is notoriously unstable with respect to changes in imaging conditions. Prior art attempts at providing robust and accurate color representations include mapping images into a perceptually uniform color space (e.g. Lab, Munsell, etc.) and representing color as a function of surface reflectance. While attaining some limited success in improving color representation, both these methods require pre-segmentation of an image before color surface recognition analysis can occur in that image. Some prior art methods do avoid a pre-segmentation step but, as a result, require some simplifying approximations to be made such as assuming all surfaces are flat (e.g. Retinex theory) or that surfaces are linear combinations of only a few basic reflectances (e.g. Maloney and Wandell).
Image segmentation is generally recognized as the partition of an image into a set of non-overlapped regions, homogenous with respect to some criteria, whose union covers the entire image.
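The definition above can be made concrete with a small sketch. It assumes the simplest possible homogeneity criterion (identical pixel value within a 4-connected region) and uses hypothetical helper names `segment` and `is_partition`. It is not the segmentation method of any cited reference.

```python
from collections import deque

def segment(image):
    """Split a 2-D grid into 4-connected regions of identical pixel value."""
    height, width = len(image), len(image[0])
    seen = set()
    regions = []
    for y in range(height):
        for x in range(width):
            if (y, x) in seen:
                continue
            value = image[y][x]
            region = set()
            queue = deque([(y, x)])
            seen.add((y, x))
            while queue:  # breadth-first flood fill from the seed pixel
                cy, cx = queue.popleft()
                region.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and (ny, nx) not in seen
                            and image[ny][nx] == value):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def is_partition(regions, height, width):
    """Check that regions do not overlap and that their union covers the image."""
    all_pixels = {(y, x) for y in range(height) for x in range(width)}
    covered = set().union(*regions) if regions else set()
    no_overlap = sum(len(r) for r in regions) == len(covered)
    return no_overlap and covered == all_pixels

# Illustrative 3x3 image whose pixel values are color names.
image = [
    ["r", "r", "g"],
    ["r", "g", "g"],
    ["b", "b", "g"],
]
regions = segment(image)
print(len(regions), is_partition(regions, 3, 3))
```

On natural images an exact-equality criterion would fragment the image into a very large number of tiny regions, which hints at why practical segmentation is both costly and error-prone.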
In the prior art described below, the examples of region localization and recognition methods all require segmentation of an image prior to subsequent analysis. Automated segmentation of images is very computationally and time intensive and often introduces errors due to its imprecision. Manual image segmentation, which is frequently more accurate, is typically labor intensive and cost prohibitive. Segmentation is a difficult and time consuming step because of the variety of artifacts that exist in images. Correctly determining where an object is and what an object is are both made difficult by object occlusion, illumination artifacts, shading, reflections, object orientation, and blending with a similar background or texture.
The present invention, unlike the prior art discussed below, enables the localization and recognition of image regions, or color surfaces, without first requiring pre-segmentation of an image.
The patent to Takahashi et al. (U.S. Pat. No. 4,203,671) provides for a method of detecting, within RGB space, whether or not an image region is human skin under various conditions.
The patent to Gast et al. (U.S. Pat. No. 4,414,635) provides for a method of labeling regions according to the color of pixels within the region, but utilizes a pre-determined light source for imaging.
The patent to Hoffrichter et al. (U.S. Pat. No. 4,623,973) provides for a color recognition method which transforms an RGB pixel value of a pre-isolated color region into a luminance-chrominance space.
The patent to Huntsman (U.S. Pat. No. 4,884,130) provides for a new color space which describes colors in a way more closely related to human color perception.
The patent to Lugos (U.S. Pat. No. 4,917,500) provides for an object recognition system which only works on pre-segmented images under predetermined lighting conditions.
The patents to Inuzuka et al. (U.S. Pat. No. 5,307,088) and Morag et al. (U.S. Pat. No. 5,517,334) provide for a method of quantizing and transforming pixel colors from one color space to another. Recognizing and locating color surfaces do not appear to be discussed.
The patent to Davis et al. (RE 33,244) provides for a color transparency printing system which dynamically adjusts color look-up tables in response to non-linearities in the components of the system. However, no optimization of the actual color space to assist with surface recognition is provided.
The patent to Washio et al. (U.S. Pat. No. 5,109,274) provides for an image processing system which determines the chromatic and achromatic nature of an image in order to implement appropriate algorithms for that image type.
The patent to Fossum (U.S. Pat. No. 5,220,646) provides for a computer graphics system which implements a z-buffer such that colors of rendered images and background colors can be determined and hidden lines removed in a single memory pass. While applicable to polygon images, this system does not appear to consider color surface detection or coding.
The patent to Shibazaki (U.S. Pat. No. 5,386,483) provides for a method of iteratively examining image regions to determine pixel colors to assist with choosing overlapping regions in order to achieve high printing quality.
The patent to Kasson et al. (U.S. Pat. No. 5,390,035) provides for mapping a 3-dimensional input color domain into an m-dimensional color space. The conversion method is directed towards generating a function which improves accuracy while minimizing computational burden; color surface differentiation does not appear to be discussed.
The patent to Shamir (U.S. Pat. No. 5,568,555) provides for an encoding system which increases symbol density by utilizing both color and color intensity as variables. Linear analysis is performed to calibrate color intensities from different colorant sources; however, a color code book training system or analysis of free-form images is not provided.
The patent to Ring et al. (U.S. Pat. No. 5,754,184) provides for an image fidelity system which converts images from a device-dependent space to a device-independent space.
The IBM TDB entitled “Color Graphics Picture Segmentation” describes a method of compressing images which segments neighborhoods of pixels that have similar colors that are separated by pixels having a different color. Each pixel is mapped to one of N allowable colors thus segmenting the image into N color regions.
The IBM TDB entitled “Presentation Space Coloring” describes an object-oriented method of generically specifying color which is independent of the underlying system data types and resolution. While not color-space dependent, this scheme does not consider combining color spaces or optimizing a color space to aid with surface recognition.
Whatever the precise merits, features and advantages of the above cited references, none of them achieve or fulfill, individually or in combination, the purposes of the present invention. Specifically, they fail to provide for a compact and simple coding of color surfaces which provides robust representation of the surfaces under varying imaging conditions. Without such a color surface code book, the prior art also fails to provide for an image region localization and recognition method that can be performed on an unsegmented image. In addition, the prior art fails to show an image database indexing method which relies on a color surface code book to provide the ability to perform color queries.
Indexing and retrieving images from an image database, in response to content-based color queries, is accomplished utilizing semantic labels and codes from a color code book. The color code book entries provide descriptions of a variety of colored surfaces under varying imaging conditions. Images from a database are first analyzed to determine which images contain which color surfaces. For each color label in a color code book, an index entry is created linking the label with the image regions located during image analysis. The index allows queries based on semantic labels to identify similar images in the database. In addition, querying by sample image is also possible by first determining which color surfaces (and their corresponding labels) are present in the sample image.
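The indexing and query flow summarized above can be sketched as a simple inverted index from color labels to image regions. This is only an illustration under stated assumptions: the function names, the label strings, the bounding-box region format, and the ranking by number of shared labels are all hypothetical, and the image analysis step that produces the detections is assumed to exist elsewhere.

```python
from collections import defaultdict

def build_index(detections):
    """Build label -> [(image_id, region), ...] from image analysis results.

    detections: iterable of (image_id, label, region) tuples, where region is
    a hypothetical (left, top, right, bottom) bounding box.
    """
    index = defaultdict(list)
    for image_id, label, region in detections:
        index[label].append((image_id, region))
    return index

def query_by_label(index, label):
    """Return the ids of images containing a surface with the given label."""
    return sorted({image_id for image_id, _ in index.get(label, [])})

def query_by_sample(index, sample_labels):
    """Rank images by how many of the sample image's labels they share."""
    shared = defaultdict(set)
    for label in sample_labels:
        for image_id, _ in index.get(label, []):
            shared[image_id].add(label)
    return sorted(shared, key=lambda image_id: (-len(shared[image_id]), image_id))

# Illustrative detections, as if produced by analyzing three database images.
detections = [
    ("img1", "sky", (0, 0, 100, 40)),
    ("img1", "grass", (0, 40, 100, 100)),
    ("img2", "sky", (0, 0, 80, 30)),
    ("img3", "skin", (10, 10, 30, 30)),
]
index = build_index(detections)
print(query_by_label(index, "sky"))
print(query_by_sample(index, {"sky", "grass"}))
```

In the sample-image case, the labels passed to `query_by_sample` would come from running the same color surface analysis on the sample image before consulting the index.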