1. Field of Disclosure
The present disclosure generally relates to the field of content-based image searching and, more specifically, to image sorting by color.
2. Brief Description of Related Art
A primary challenge in the design of a content-based image retrieval system involves identifying meaningful color attributes that can be extracted from the content and used to rank the content according to its degree of relevance to a particular search request. One such attribute is the color histogram. In statistics, a histogram is a graphical representation that gives a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable. A histogram consists of tabular frequencies, shown as adjacent rectangles erected over discrete intervals (bins or buckets), each with an area equal to the frequency of the observations in the interval. The height of each rectangle therefore equals the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data points.
A histogram may also be normalized to display relative frequencies, so that it shows the proportion of cases that fall into each of several categories, with the total area summing to one. The categories are usually specified as consecutive, non-overlapping intervals of a variable. The categories or intervals must be adjacent, and often are chosen to be of the same size. The rectangles of a histogram are drawn so that they touch each other to indicate that the original variable is continuous.
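By way of illustration only, the relationship between bin counts, frequency density, and a normalized histogram can be sketched as follows. The function names, sample values, and bin edges are hypothetical and chosen for the example; they are not part of any described system.

```python
def bin_counts(values, edges):
    """Count the values falling in each bin [edges[i], edges[i+1]).

    The last bin is treated as closed on the right so that a value equal
    to the final edge is still counted (an assumption of this sketch).
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


def frequency_density(counts, edges):
    """Rectangle height: frequency divided by the width of the interval."""
    return [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]


def relative_frequencies(counts):
    """Normalized histogram: proportion of observations per bin."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


values = [1, 2, 2, 3, 5, 7, 8, 9]   # hypothetical observations
edges = [0, 2, 4, 10]               # adjacent bins of unequal width

counts = bin_counts(values, edges)              # [1, 3, 4]
density = frequency_density(counts, edges)
widths = [edges[i + 1] - edges[i] for i in range(len(counts))]
area = sum(d * w for d, w in zip(density, widths))
relative = relative_frequencies(counts)

print(counts, density, area, relative)
```

With unequal bin widths, the heights differ from the raw counts, while the total area still equals the number of observations and the relative frequencies sum to one.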
Generally, the histogram provides a compact summarization of the distribution of data in an image. The color histogram of an image is relatively invariant to translation and to rotation about the viewing axis, and varies only slowly with the angle of view. Because the histogram signatures of two images can be compared to match the color content of one image with that of the other, the color histogram is particularly well suited to the problem of recognizing an object of unknown position and rotation within a scene.
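The rotation invariance and the comparison of histogram signatures described above can be sketched as follows. This is a minimal illustration under stated assumptions: images are represented as lists of rows of (R, G, B) tuples with 8-bit channels, each channel is quantized into a hypothetical number of levels, and histogram intersection is used as the similarity measure. The source does not specify a particular comparison method, so the choice of intersection is an assumption.

```python
from collections import Counter


def color_histogram(image, levels=4):
    """Normalized color histogram of an image given as rows of (R, G, B) tuples.

    Each 8-bit channel is quantized into `levels` bins (hypothetical default).
    """
    step = 256 // levels
    hist = Counter()
    n = 0
    for row in image:
        for r, g, b in row:
            key = tuple(min(c // step, levels - 1) for c in (r, g, b))
            hist[key] += 1
            n += 1
    return {k: v / n for k, v in hist.items()}


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical signatures, 0.0 for disjoint ones."""
    return sum(min(p, h2.get(k, 0.0)) for k, p in h1.items())


def rotate90(image):
    """Rotate the pixel grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

scene = [[RED, RED, GREEN],
         [BLUE, GREEN, GREEN]]
rotated = rotate90(scene)            # same pixels, different positions
all_blue = [[BLUE] * 3, [BLUE] * 3]  # same size, different color content

same = intersection(color_histogram(scene), color_histogram(rotated))
different = intersection(color_histogram(scene), color_histogram(all_blue))
print(same, different)
```

The rotated scene yields the same signature as the original because the histogram discards pixel positions, whereas the all-blue image overlaps with the scene only in its single blue pixel.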
The first deficiency of image histograms is that, while the representation of an object in the image histogram depends upon the color of the object being studied, the image histogram ignores the object's shape and texture. Due to this drawback, the color histograms can potentially be identical for two images that have different object content but identical color information. Put another way, without spatial or shape information, different objects of the same color may be indistinguishable based solely on color histogram comparisons. Consequently, there is no way to distinguish a green leaf from a green box. A second deficiency of image histograms is that image histogram-based algorithms do not distinguish between “generic” and “specific” objects. For example, a representation of a green leaf is not useful for matching a red leaf that is otherwise identical except for its color.
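Both deficiencies can be illustrated with a small sketch. The images below are hypothetical 4x4 grids in which each character is a color label ("G" for green, "R" for red, "." for background); the labels and shapes are assumptions made for the example.

```python
from collections import Counter


def label_histogram(image):
    """Count how often each color label occurs, ignoring pixel positions."""
    return Counter(pixel for row in image for pixel in row)


# Two different shapes drawn with the same number of green pixels.
green_box = ["GG..",
             "GG..",
             "....",
             "...."]
green_diagonal = ["G...",
                  ".G..",
                  "..G.",
                  "...G"]

# The same box shape drawn in red instead of green.
red_box = [row.replace("G", "R") for row in green_box]

# First deficiency: different shapes, identical histograms.
shapes_indistinguishable = label_histogram(green_box) == label_histogram(green_diagonal)

# Second deficiency: identical shape, but the histograms no longer match.
same_shape_matches = label_histogram(green_box) == label_histogram(red_box)

print(shapes_indistinguishable, same_shape_matches)
```

In this sketch the box and the diagonal compare as equal even though their shapes differ, while the green and red boxes compare as unequal even though their shapes are identical.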
A third deficiency of image histograms is that they are highly sensitive to noise such as changes in lighting intensity and quantization errors. Generally, translating a red-green-blue (“RGB”) image into the illumination-invariant red-green chromaticity space (i.e., RG-chromaticity space) allows the histogram to operate well under varying light levels. One overriding problem in image searching is that there may be a mismatch between words (i.e., the way one searches) and images (i.e., what one is searching for in a particular instance). There is a need to resolve these issues by adding a text tag to the image, thereby adding an additional dimension for comparative and descriptive purposes.
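The translation into RG-chromaticity space mentioned above can be sketched as follows, using the standard definition in which each of the red and green channels is divided by the sum of the three channels. The function name, the sample pixel, and the handling of a pure black pixel are assumptions made for illustration.

```python
def rg_chromaticity(pixel):
    """Map an (R, G, B) pixel to (r, g), where r = R/(R+G+B) and g = G/(R+G+B).

    A pure black pixel has no defined chromaticity; this sketch maps it to
    the neutral point (1/3, 1/3) as an assumption.
    """
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total)


bright = (200, 100, 50)   # hypothetical pixel under strong lighting
dim = (100, 50, 25)       # the same surface at half the intensity

chroma_bright = rg_chromaticity(bright)
chroma_dim = rg_chromaticity(dim)
print(chroma_bright, chroma_dim)
```

Because a uniform change in intensity scales all three channels by the same factor, the factor cancels in the ratio and both pixels map to the same chromaticity, so they fall into the same histogram bin.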