1. Field of the Invention
The present invention relates generally to the field of querying a database of images and particularly to querying a database of images using a combination of region-based image matching and boundary-based image matching.
2. Description of the Related Art
Huge amounts of information are being circulated daily at an uninterrupted pace on the World-Wide Web (WWW). Additionally, in museums and photo stock agencies, millions of images are stored for on-line usage. With the explosive growth in the volume and distribution of information, more intelligent information processing and management has become critical. Various methods of accessing target data have been developed to address issues relating to image retrieval, image clustering, query interface and WWW information retrieval.
The Virage Image Retrieval Engine is a system for image retrieval based on visual features, such as color, shape and texture. See J. R. Bach, et al., "The Virage Image Search Engine: An Open Framework for Image Management," Proceedings of SPIE - The International Society for Optical Engineering: Storage and Retrieval for Still Image and Video Databases IV, pp. 76-87, February 1996. Virage creates feature vectors relating to color, composition, texture and structure. In a matching phase, where a query is matched against potential result candidates, the Virage search engine calculates the distance between the feature vector of each potential result candidate and the feature vector of the query. The result candidates are then presented. The approach presented by Virage focuses on the calculation of the vector and of each element of the vector. Such an approach is not intuitive for users. In the registration (image indexing) and matching phases, Virage does not identify objects contained in the image. Rather, the feature vectors are extracted based on the whole image or a block of the image. Thus, it is difficult to introduce analysis at an object level, such as object-based navigation and object-based clustering.
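By way of illustration only, the whole-image matching phase described above can be sketched as a nearest-neighbor ranking over feature vectors. The sketch below is not the Virage implementation. The Euclidean metric, the four-element vectors, the file names and the function names `euclidean_distance` and `rank_candidates` are all assumptions introduced for this example.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_candidates(query_vector, candidates):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [
        (name, euclidean_distance(query_vector, vector))
        for name, vector in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical whole-image vectors, e.g. (color, composition, texture, structure).
candidates = {
    "sunset.jpg": [0.9, 0.2, 0.1, 0.4],
    "forest.jpg": [0.1, 0.8, 0.7, 0.3],
    "beach.jpg": [0.8, 0.3, 0.2, 0.5],
}
query = [0.9, 0.2, 0.1, 0.4]

ranking = rank_candidates(query, candidates)
for name, distance in ranking:
    print(f"{name}: {distance:.3f}")
```

Because each vector summarizes the whole image or a block of it, nothing in such a ranking identifies which object in the image caused the match, which is the limitation noted above.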
QBIC, a system developed at IBM, is another system that supports image retrieval using visual examples. See M. Flickner, et al., "Query by Image and Video Content: The QBIC System," Intelligent Multimedia Information Retrieval, edited by Mark T. Maybury, Chapter 1, reprinted from IEEE Computer, 28(9): 23-31, September 1995. As in Virage, image matching in QBIC is based on features such as color, texture, and shape.
The instant inventor has assisted in developing a Content Oriented Information Retrieval engine (the COIR engine) to index and retrieve a large amount of image and video data. See K. Hirata, et al., "Media-based Navigation for Hypermedia Systems," ACM Hypertext'93 pp. 159-173, November 1993; K. Hirata, et al., "Content-oriented Integration in Hypermedia Systems," ACM Hypertext'96 pp. 11-26, March 1996; and K. Hirata, et al., "Object-based Navigation: An Intuitive Navigation Style for Content-oriented Integration Environment," ACM Hypertext'97 pp. 75-86, April 1997. Each of these three Hirata articles, as well as each of the references discussed throughout, is hereby incorporated by reference herein. One of the major characteristics of the COIR engine is that visual and semantic attributes are assigned to an object in an image or video scene. In a registration phase, the COIR engine divides the input image into several regions, assuming them to be meaningful objects. The COIR engine automatically extracts visual attribute values, including color, shape and texture-related values, from each region. Semantic attributes are then assigned to each region. Both visual and semantic attributes are stored hierarchically based on object structure, which improves the precision of image searching compared with existing feature-based image processing approaches.
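By way of illustration only, the hierarchical storage of visual and semantic attributes described above can be sketched as a small tree of object nodes. This is not the COIR engine's actual metadata format. The class name `ObjectNode`, its fields, the attribute keys and the sample values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectNode:
    """One region (assumed object) with its attributes and sub-objects."""
    name: str
    visual: dict = field(default_factory=dict)    # e.g. color, shape, texture values
    semantic: list = field(default_factory=list)  # e.g. assigned keywords
    children: list = field(default_factory=list)  # nested ObjectNode instances

    def find(self, keyword):
        """Yield every node in this subtree whose semantic attributes include keyword."""
        if keyword in self.semantic:
            yield self
        for child in self.children:
            yield from child.find(keyword)


# Hypothetical registration result for one image.
image = ObjectNode(
    name="image_001",
    children=[
        ObjectNode("region_0", visual={"mean_color": (30, 90, 200)}, semantic=["sky"]),
        ObjectNode(
            "region_1",
            visual={"mean_color": (180, 20, 20)},
            semantic=["car"],
            children=[
                ObjectNode("region_1a", visual={"mean_color": (20, 20, 20)}, semantic=["wheel"]),
            ],
        ),
    ],
)

matches = [node.name for node in image.find("wheel")]
print(matches)
```

Keeping the attributes on each node, rather than in one vector per image, is what allows a search to address an individual object or a part of an object.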
The COIR engine provides flexible multimedia search capabilities based on a metadata structure, the metadata being the data that results from the registration phase, in which various features are extracted from the images. In the matching phase, where a query is applied to the available images, the COIR engine correlates objects contained in the query image with those in the target image, comparing attribute values at the object level. The COIR engine evaluates the position of the objects and their overlapping areas in order to correlate the objects with each other. This matching process enables an intuitive search for users. Users can integrate visual characteristics and semantics at the object level, and can focus on specific objects in the image or on a relationship between objects to retrieve the images they are interested in. However, a one-to-one correlation between visual features and semantic meanings does not always exist. In other words, while in many situations it is possible to perform object extraction based on visual characteristics, in other situations it is very difficult to extract objects using visual characteristics. In the latter situations, the results of object extraction are not what a user might expect. Since object extraction is executed automatically based on the visual characteristics, the COIR engine cannot always extract the object from the image.
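By way of illustration only, correlating objects by position and overlapping area, as described above, can be sketched with axis-aligned bounding boxes. This is not the COIR engine's actual procedure. Representing regions as boxes, scoring overlap as intersection over union, matching greedily, and the 0.3 threshold are all assumptions made for this example, as are the names `Region`, `overlap_ratio` and `correlate_regions`.

```python
from dataclasses import dataclass


@dataclass
class Region:
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max), assumed bounding box of the region


def overlap_ratio(a, b):
    """Intersection area divided by union area of two boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    width = max(0, min(ax1, bx1) - max(ax0, bx0))
    height = max(0, min(ay1, by1) - max(ay0, by0))
    intersection = width * height
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - intersection
    return intersection / union if union > 0 else 0.0


def correlate_regions(query_regions, target_regions, min_overlap=0.3):
    """Greedily pair each query region with its best-overlapping unused target region."""
    pairs = []
    used = set()
    for query in query_regions:
        best_index, best_score = None, min_overlap
        for index, target in enumerate(target_regions):
            if index in used:
                continue
            score = overlap_ratio(query.box, target.box)
            if score >= best_score:
                best_index, best_score = index, score
        if best_index is not None:
            used.add(best_index)
            pairs.append((query.label, target_regions[best_index].label, best_score))
    return pairs


# Hypothetical regions extracted from a query image and a target image.
query_regions = [Region("q_upper", (0, 0, 100, 40)), Region("q_lower", (0, 40, 100, 100))]
target_regions = [Region("t_upper", (0, 0, 100, 50)), Region("t_lower", (0, 50, 100, 100))]

pairs = correlate_regions(query_regions, target_regions)
for query_label, target_label, score in pairs:
    print(f"{query_label} <-> {target_label}: overlap {score:.2f}")
```

Once regions are paired, attribute values such as color or texture can be compared pair by pair. The pairing is only as good as the regions themselves, so a poor automatic extraction leads directly to a poor match, which is the difficulty identified above.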