Various techniques are commonly employed for retrieving images stored in a database. The most conventional technique for storing and retrieving images that match a desired characteristic is to associate keywords with each image, such as "portrait", "seascape", "mountain", "presidents", etc. Once such keywords have been associated with the images, a user provides one or more search words to the search or retrieval system, and the system presents one or more images in dependence upon the degree of correspondence between the search words and the stored keywords. Conventional Internet search engines are examples of such text based retrieval means.
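The keyword matching described above can be sketched as follows. This is a minimal illustration only: the index contents, the function name keyword_search, and the scoring rule (the fraction of search words found among an image's keywords) are assumptions chosen for the example, not a description of any particular retrieval system.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical keyword index: image name -> keywords assigned by a categorizer.
KEYWORD_INDEX: Dict[str, Set[str]] = {
    "img_001.jpg": {"portrait", "presidents"},
    "img_002.jpg": {"seascape", "beach"},
    "img_003.jpg": {"mountain", "seascape"},
}


def keyword_search(
    search_words: List[str], index: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Rank images by the fraction of search words found among their keywords."""
    query = {word.lower() for word in search_words}
    if not query:
        return []
    scored = []
    for name, keywords in index.items():
        score = len(query & keywords) / len(query)
        if score > 0:
            scored.append((name, score))
    # Highest degree of correspondence first; ties are broken by image name.
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

A search for "seascape" and "beach" ranks the image carrying both keywords ahead of the image carrying only one, whereas a search for a word the categorizer never used, such as "ocean", returns nothing.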
Text based image retrieval, however, requires the categorizing of each picture by keywords, which can be a burdensome process if applied to hundreds or thousands of images. In addition, the individual choice of keywords limits the effectiveness of the search to the degree of correspondence between the words the categorizer used to describe the stored images and the words the searcher uses to describe the desired image.
Graphics based retrieval is a more intuitive approach to image retrieval. Conventional graphics based retrieval systems employ various forms of color or pattern matching. A graphics based system, however, can be computationally intensive. Computer images are typically stored as an array of thousands of pixels, and the color of each of the thousands of pixels is encoded as a multi-byte red-green-blue (RGB) value. The comparison of a target image to a collection of reference images based on these thousands of color values is computationally impractical, and a pixel-by-pixel comparison may not provide a measure of similarity that correlates to the human visual system. Practical graphics based systems, therefore, characterize an image based on a descriptive characteristic of the image, and the comparisons among images are based on the descriptive characteristic. The descriptive characteristics of an image include, for example, the colors contained within the image, the edges contained within the image, the arrangement of the colors, the orientation of the edges, etc.
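The difference between a pixel-by-pixel comparison and a comparison based on a descriptive characteristic can be illustrated with a small sketch. The quantization into four levels per channel, the function names, and the striped test pattern are all assumptions made for the example; a color histogram is only one of the descriptive characteristics mentioned above.

```python
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each channel 0..255


def quantize(pixel: Pixel, levels: int = 4) -> Tuple[int, ...]:
    """Map each 8-bit channel onto one of `levels` coarse bins."""
    step = 256 // levels
    return tuple(channel // step for channel in pixel)


def color_histogram(pixels: List[Pixel], levels: int = 4) -> Counter:
    """Count how many pixels fall into each coarse color bin."""
    return Counter(quantize(p, levels) for p in pixels)


def pixelwise_mismatches(a: List[Pixel], b: List[Pixel]) -> int:
    """Number of positions at which the two images have different colors."""
    return sum(1 for p, q in zip(a, b) if p != q)


def histogram_difference(a: List[Pixel], b: List[Pixel], levels: int = 4) -> int:
    """Sum of absolute differences between the two color histograms."""
    ha, hb = color_histogram(a, levels), color_histogram(b, levels)
    return sum(abs(ha[k] - hb[k]) for k in set(ha) | set(hb))


# A one-row striped image, and the same stripes shifted by a single pixel.
RED, BLUE = (255, 0, 0), (0, 0, 255)
stripes = [RED, BLUE] * 8
shifted = stripes[1:] + stripes[:1]
```

Here the two striped patterns would look nearly identical to a viewer, yet they differ at every pixel position, while their color histograms are the same.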
A single characterization of an image, however, may be too coarse an abstraction to distinguish among images. A single characterization of a seascape may result in a histogram of color such as: 40% blue, 20% brown, and 40% blue-green. A more descriptive characteristic would include the characterization of the blue color being primarily at the top of the image (the sky), the brown in the middle (the beach), and the blue-green at the bottom of the image (the water). In this manner, images that have the same color proportions but have the blue color located at the lower-left of the image would be characterized differently from a seascape. Conventional graphics based retrieval systems, therefore, also include a partitioning of the image into an array of partitions, each partition occupying a known location in the image. Comparisons among images are based on a comparison of each corresponding partition in the images.
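The effect of partitioning can be sketched using the seascape proportions given above. The image is modeled as a grid of color labels rather than RGB values, the second image is simply the seascape turned upside down, and the 5 x 2 partition grid and the L1 distance are assumptions chosen to keep the example small.

```python
from collections import Counter
from typing import Dict, List, Tuple

Image = List[List[str]]  # rows of color labels, standing in for quantized pixels


def color_histogram(image: Image) -> Dict[str, float]:
    """Fraction of pixels of each color over the given image or partition."""
    counts = Counter(color for row in image for color in row)
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


def partition_histograms(image: Image, rows: int, cols: int) -> List[Dict[str, float]]:
    """Histogram each cell of a rows x cols grid laid over the image.

    Assumes the image height and width are divisible by rows and cols.
    """
    ph, pw = len(image) // rows, len(image[0]) // cols
    result = []
    for r in range(rows):
        for c in range(cols):
            block = [row[c * pw:(c + 1) * pw] for row in image[r * ph:(r + 1) * ph]]
            result.append(color_histogram(block))
    return result


def histogram_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """L1 distance between two histograms."""
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))


def partitioned_distance(a: Image, b: Image, rows: int, cols: int) -> float:
    """Sum of distances between corresponding partitions of the two images."""
    pairs = zip(partition_histograms(a, rows, cols), partition_histograms(b, rows, cols))
    return sum(histogram_distance(x, y) for x, y in pairs)


def make_bands(bands: List[Tuple[str, int]], width: int = 10) -> Image:
    """Build an image from horizontal color bands given as (color, row count)."""
    return [[color] * width for color, n in bands for _ in range(n)]


# 40% blue (sky), 20% brown (beach), 40% blue-green (water), top to bottom.
seascape = make_bands([("blue", 4), ("brown", 2), ("blue-green", 4)])
upside_down = seascape[::-1]  # same proportions, blue now at the bottom
```

The whole-image histograms of the two images are identical, so a single characterization cannot tell them apart; the partition-by-partition distance is nonzero because the blue and blue-green regions occupy different locations.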
Typically, images are partitioned into dozens or hundreds of partitions, and each partition is characterized by a multidimensional descriptive characteristic, such as a histogram of colors or edges. Comparing one image to another, therefore, requires the comparison of dozens or hundreds of multidimensional characteristics of the images. Comparing a target image to thousands of images in a large reference image database can be computationally infeasible for a real-time image retrieval process.
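The growth in work for an exhaustive comparison can be estimated with simple arithmetic. The figures used below (100 partitions per image, 64 histogram bins per partition, and the database sizes) are illustrative assumptions only and do not come from any measured system.

```python
def bin_comparisons(
    num_reference_images: int, partitions_per_image: int, bins_per_partition: int
) -> int:
    """Bin-level comparisons needed to compare one target image to every reference image."""
    return num_reference_images * partitions_per_image * bins_per_partition


# Illustrative figures only: the work grows linearly with the database size.
for num_images in (1_000, 10_000, 100_000):
    print(num_images, bin_comparisons(num_images, 100, 64))
```

Under these assumptions, each tenfold increase in the number of reference images produces a tenfold increase in the work required for a single query.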
Therefore, a need exists for a method and apparatus that minimizes the processing time required to compare a target image to a plurality of reference images. Because of the increasing availability of image data via, for example, the Internet, a need also exists for a method and apparatus for image retrieval from a distributed database that allows for the incremental addition of images to the database. A need also exists for a method and apparatus for image retrieval that does not exhibit progressive performance degradation as the size of the database increases.