This invention relates to the field of computers, and more particularly, to storage and retrieval of images by computers. More specifically, it relates to comparison of two images in digital form on the basis of their color contents and color distribution. Still more specifically, the presently preferred embodiment of this invention relates to digital image databases, and the classification, storage and retrieval of images based on spatial distribution of color and other color-related features. In one application of the invention, images may be full color, still pictures, obtained from a digital camera, a scanning machine, or computer graphics software. PicDB™ is a software tool that implements the methods presented herein.
In one aspect, the invention is directed to use with a computer repository constituting a large number of images stored in digital form. Typical applications are a police department's mug shot database containing photographs of criminals, or an on-line fashion catalog that provides computerized access to pictures of existing designs. Conventionally, sequential browsing was the only method for finding a desired image (i.e., one that satisfies a set of criteria) in a large collection. The search process was simplified by performing some type of pre-processing on the images. However, such pre-processing is costly and inflexible. Moreover, in many cases, the desired image may embody characteristics that were not foreseen in the pre-processing phase. Furthermore, information on the existence of a color combination and the location of each color has not been used effectively in the process of search and retrieval of images.
In the traditional style of image management and retrieval, each file containing an image was assigned an identifier, and retrieval of images was on the basis of the identifier. The retrieval process was made more efficient by selecting a meaningful tag rather than a simple, numerical identifier. The problem with this method was that the exact identifier of the desired image had to be known in order to retrieve it. A rather important improvement was the idea of storing the attributes of each image along with the image itself. Such attributes, for example, an English description of what the image is about, were indexed and stored in a simple relational database. This technique, known as content description, is important and indispensable to the task of image retrieval. However, such descriptions are only subjective interpretations of the contents and can capture just a fraction of the many facets of the information that is potentially obtainable from an image.
The foregoing problems have been overcome in part by resorting to indexing systems that associate a set of indexes with each image on the basis of the features that can be found in that image. The existence or non-existence of features of interest is the guiding principle for retrieving the right set of images from the database. In order that the system can deal with unforeseen queries (i.e., those not catered to by the original classification algorithm), a new set of techniques, referred to as retrieval by content, has been proposed. In this approach, the retrieval process is not bound to predefined attributes, and many more types of queries can be posed to the image database. This invention relates to a content-based indexing and retrieval technique that operates on the basis of color combination and the area of the image in which certain colors are present. These methods are implemented in conjunction with an image database system called PicDB.
It is therefore a broad object of this invention to provide a computer repository for storing digital images in such a way that the visual contents of each image, particularly the existence and the location of any combination of colors in that image, are used for classification, indexing, identification and retrieval of that image.
Another object of this invention is to enhance the performance of image database systems by making use of features relating to color distribution in the process of search and retrieval.
It is a more specific object of this invention to provide users of image database systems, for example PicDB, with the capability of querying the database on the basis of colors that appear in specific areas of images contained in the database.
Briefly, these and other objects of the invention are achieved by a software tool that takes as input: I) the specification of a color or a set of colors, II) the specification of approximate coordinates of where those colors should occur, and III) a set of images that are stored on a computer's disks; and outputs the subset of the given images (III) that satisfy the color criteria specified in (I) and (II). This software is incorporated into a comprehensive process for storage and retrieval of digital images in which images are first compressed by an algorithm provided by the JPEG standard prior to being stored. Some of the computations necessary for the current invention are derived directly from the values that the JPEG algorithm provides. Therefore the tool imposes minimal overhead on the operation of the image database system that makes use of it. This invention allows the users of image database systems to retrieve the desired images by their color contents.
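By way of illustration only, the three inputs and the output described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the PicDB implementation: it assumes that each stored image has already been reduced to a coarse grid of average block colors (one plausible use of values made available by JPEG compression, though the text above does not say which values are used), and the names ColorCriterion, satisfies and query, the RGB distance measure, and the tolerance and min_fraction parameters are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]   # (R, G, B), each component 0..255
Grid = List[List[Color]]       # assumed: rows of per-block average colors


@dataclass
class ColorCriterion:
    color: Color                                # input (I): the color sought
    region: Tuple[float, float, float, float]   # input (II): x0, y0, x1, y1 as fractions of image size
    tolerance: float = 80.0                     # hypothetical: largest RGB distance counted as a match
    min_fraction: float = 0.5                   # hypothetical: share of region blocks that must match


def color_distance(a: Color, b: Color) -> float:
    """Euclidean distance in RGB space (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def satisfies(grid: Grid, criterion: ColorCriterion) -> bool:
    """Return True if enough blocks inside the region are close to the color."""
    rows, cols = len(grid), len(grid[0])
    x0, y0, x1, y1 = criterion.region
    # Map fractional coordinates to block indices; always cover at least one block.
    c0 = min(int(x0 * cols), cols - 1)
    r0 = min(int(y0 * rows), rows - 1)
    c1 = max(c0 + 1, min(cols, math.ceil(x1 * cols)))
    r1 = max(r0 + 1, min(rows, math.ceil(y1 * rows)))
    blocks = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    matches = sum(
        1 for block in blocks
        if color_distance(block, criterion.color) <= criterion.tolerance
    )
    return matches / len(blocks) >= criterion.min_fraction


def query(images: Dict[str, Grid], criteria: List[ColorCriterion]) -> List[str]:
    """Input (III) is `images`; the output is the subset meeting every criterion."""
    return [
        name for name, grid in images.items()
        if all(satisfies(grid, criterion) for criterion in criteria)
    ]
```

For example, a query for red in the upper half and blue in the lower half of the image would pass two ColorCriterion objects, with regions (0.0, 0.0, 1.0, 0.5) and (0.0, 0.5, 1.0, 1.0) respectively, and would receive the names of only those images that satisfy both.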