During the last decade, the proliferation of image scanners and digital cameras, along with the advent and subsequent rapid growth of the World Wide Web, has led to substantial increases in the number of digitally stored images. The taking and storing of digital pictures has been further encouraged by a significant reduction in memory costs. Image files are now commonly included in multimedia documents.
The image files that a person or organization can readily collect can quickly become so numerous that locating the particular image files sought for a given purpose is potentially time consuming.
Traditionally, the field of Information Retrieval (IR) has focused on searching structured and/or unstructured text documents. As an extension of traditional IR, one approach to locating images relies on applying traditional IR techniques to short annotations, which are written by a user for each image file.
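By way of illustration only, the annotation-based approach can be sketched as a simple term-matching search over user-written annotations. The function names `tokenize` and `search_annotations`, the example file names, and the scoring rule (a count of matching query terms) are assumptions made for this sketch and are not drawn from any particular IR system.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_annotations(query: str, annotations: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank image files by how often the query terms occur in their annotations.

    `annotations` maps an image file name to its user-written annotation.
    Files whose annotation contains no query term are omitted.
    """
    query_terms = set(tokenize(query))
    results = []
    for filename, note in annotations.items():
        counts = Counter(tokenize(note))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            results.append((filename, score))
    # Highest score first; ties are broken by file name for a stable order.
    return sorted(results, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    notes = {
        "a.jpg": "Sunset over the beach",
        "b.jpg": "Beach party, beach volleyball",
        "c.jpg": "Mountain hike",
    }
    print(search_annotations("beach sunset", notes))
```

A practical text retrieval system would typically weight terms (for example, by their rarity across the collection) rather than use raw counts; the raw count is used here only to keep the sketch short.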
A newer category of methods known as Content Based Image Retrieval (CBIR) uses algorithms that operate on image files in order to compute a quantitative characterization (e.g., a vector) of each image file, and then uses such quantitative characterizations to judge the degree of similarity of two or more documents. The quantitative characterization can address features such as the colors, textures, and shapes included in the image files. Typically, in performing a search, a user would select an image file to be used as a basis for the search, and then select one or more particular CBIR algorithms to be used. A search engine would then try to find images whose quantitative characterizations, calculated per the selected CBIR algorithms, are close to the quantitative characterization of the basis image file. Such CBIR techniques are improving and useful; however, they have not rendered searching of text data associated with image files obsolete. The latter is still very useful.
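As a minimal sketch of the CBIR search described above, the following example characterizes each image by a normalized color histogram (one possible quantitative characterization among many) and ranks a collection by distance to a basis image. Several things here are assumptions made for illustration: images are represented as plain lists of (R, G, B) pixel tuples rather than decoded image files, the L1 distance is used as the closeness measure, and the names `color_histogram`, `l1_distance`, and `rank_by_similarity` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Compute a normalized color histogram with `bins` levels per channel."""
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences; 0.0 means identical histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(
    basis: Sequence[Pixel],
    collection: Dict[str, Sequence[Pixel]],
    bins: int = 4,
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs, closest to the basis image first."""
    basis_hist = color_histogram(basis, bins)
    scored = [
        (name, l1_distance(basis_hist, color_histogram(pixels, bins)))
        for name, pixels in collection.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    red = [(250, 10, 10)] * 4
    mostly_red = [(240, 20, 20)] * 3 + [(10, 10, 250)]
    blue = [(10, 10, 250)] * 4
    print(rank_by_similarity(red, {"blue": blue, "mostly_red": mostly_red}))
```

Texture or shape characterizations could be substituted for the color histogram without changing the ranking step, since the search only requires that each image be reduced to a vector that can be compared with the vector of the basis image.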
It would be desirable to enjoy the advantages of a variety of different information retrieval techniques simultaneously in one system.