The present invention relates generally to techniques for processing images, video and other types of information signals, and more particularly to automated systems and devices for retrieving, matching and otherwise manipulating information signals which include color pattern information.
Flexible retrieval and manipulation of image databases and other types of color pattern databases has become an important problem with applications in video editing, photo-journalism, art, fashion, cataloging, retailing, interactive CAD, geographic data processing, etc. Until recently, content-based retrieval (CBR) systems have generally required a user to enter key words to search image and video databases. Unfortunately, this approach often does not work well, since different people describe what they see or what they search for in different ways, and even the same person might describe the same image differently depending on the context in which it will be used.
One of the earliest CBR systems, known as ART MUSEUM and described in K. Hirata and T. Katzo, “Query by visual example,” Proc. of 3rd Int. Conf. on Extending Database Technology, performs retrieval based entirely on edge features. A commercial content-based image search engine with profound effects on later systems was QBIC, described in W. Niblack et al., “The QBIC project: Querying images by content using color, texture and shape,” Proc. SPIE Storage and Retrieval for Image and Video Data Bases, February 1994. For color representation, this system uses a k-element histogram and the average of the (R,G,B), (Y,i,q), and (L,a,b) coordinates, whereas for the description of texture it implements Tamura's feature set, as described in H. Tamura et al., “Textural features corresponding to visual perception,” IEEE Transactions on Systems, Man and Cybernetics, Vol. 8, pp. 460-473, 1982.
In a similar fashion, color, texture and shape are supported as a set of interactive tools for browsing and searching images in the Photobook system developed at the MIT Media Lab, as described in A. Pentland et al., “Photobook: Content-based manipulation of image databases,” International Journal of Computer Vision, 1996. In addition to providing these elementary features, systems such as VisualSeek, described in J. R. Smith and S. Chang, “VisualSeek: A fully automated content-based query system,” Proc. ACM Multimedia 96, 1996, Netra, described in W. Y. Ma and B. S. Manjunath, “Netra: A toolbox for navigating large image databases,” Proc. IEEE Int. Conf. on Image Processing, 1997, and Virage, described in A. Gupta and R. Jain, “Visual information retrieval,” Communications of the ACM, Vol. 40, No. 5, 1997, each support queries based on spatial relationships and color layout. Moreover, in the above-noted Virage system, the user can select a combination of implemented features by adjusting the weights according to his or her own “perception.” This paradigm is also supported in the RetrievalWare search engine described in J. Dowe, “Content based retrieval in multimedia imaging,” Proc. SPIE Storage and Retrieval for Image and Video Databases, 1993.
A different approach to similarity modeling is proposed in the MARS system, described in Y. Rui et al., “Content-based image retrieval with relevance feedback in Mars,” Proc. IEEE Conf. on Image Processing, 1997, where the main focus is not on finding a best representation, but rather on relevance feedback that dynamically adapts multiple visual features to different applications and different users. Although great progress has been made, none of the existing search engines offers a complete solution to the general image retrieval problem, and significant drawbacks remain in the existing techniques that prevent their use in many important practical applications.
These drawbacks can be attributed to a very limited understanding of color patterns compared to other visual phenomena such as color, contrast or even gray-level textures. For example, the basic dimensions of color patterns have not yet been adequately identified, a standardized and effective set of features for addressing their important characteristics does not exist, nor are there rules defining how these features are to be combined. Previous investigations in this field have concentrated mainly on gray-level natural textures, e.g., as described in the above-cited H. Tamura et al. reference, and in A. R. Rao and G. L. Lohse, “Towards a texture naming system: Identifying relevant dimensions of texture,” Vision Res., Vol. 36, No. 11, pp. 1649-1669, 1996. For example, the Rao and Lohse reference focused on how people classify textures in meaningful, hierarchically-structured categories, identifying relevant features used in the perception of gray-level textures. However, these approaches fail to address the above-noted color pattern problem, and a need remains for an effective framework for analyzing color patterns.
The invention provides a perceptually-based system for pattern retrieval and matching, suitable for use in a wide variety of information processing applications. The system is based in part on a vocabulary, i.e., a set of perceptual criteria used in comparison between color patterns associated with information signals, and a grammar, i.e., a set of rules governing the use of these criteria in similarity judgment. The system utilizes the vocabulary to extract perceptual features of patterns from images or other types of information signals, and then performs comparisons between the patterns using the grammar rules. The invention also provides new color and texture distance metrics that correlate well with human performance in judging pattern similarity.
An illustrative embodiment of a perceptually-based system in accordance with the invention uses a predetermined vocabulary comprising one or more dimensions to extract color and texture information from an information signal, e.g., an image, selected by a user. The system then generates a distance measure characterizing the relationship of the selected image to another image stored in a database, by applying a grammar, comprising a set of predetermined rules, to the color and texture information extracted from the selected image and corresponding color and texture information associated with the stored image. For example, the system may receive the selected image in the form of an input image A submitted in conjunction with a query from the user. The system then measures dimensions DIM_i(A) from the vocabulary, for i = 1, ..., N, and for each image B from an image database, applies rules R_i from the grammar to obtain corresponding distance measures dist_i(A, B), where dist_i(A, B) is the distance between the images A and B according to rule i.
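By way of illustration only, the query flow described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the two dimension functions (`overall_color`, `horizontal_variation`), the two rules, the 0.1 threshold, and all other names are hypothetical stand-ins chosen so that the example is self-contained. Only the overall structure (measure the vocabulary dimensions of the query image A, then apply each grammar rule to A and each database image B) follows the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]                  # (R, G, B), each 0..255
Image = Sequence[Sequence[Pixel]]             # rows of pixels
Features = Dict[str, float]                   # dimension name -> measured value
Rule = Callable[[Features, Features], float]  # rule i -> distance for rule i


def brightness(pixel: Pixel) -> float:
    """Mean of the three channels, scaled to the range 0..1."""
    r, g, b = pixel
    return (r + g + b) / (3 * 255)


def overall_color(image: Image) -> float:
    """Hypothetical stand-in for a color dimension: mean brightness."""
    pixels = [p for row in image for p in row]
    return sum(brightness(p) for p in pixels) / len(pixels)


def horizontal_variation(image: Image) -> float:
    """Hypothetical stand-in for a texture dimension: mean brightness
    change between horizontally adjacent pixels."""
    diffs = [abs(brightness(a) - brightness(b))
             for row in image for a, b in zip(row, row[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Assumed vocabulary: named dimensions, each measurable on one image.
VOCABULARY: Dict[str, Callable[[Image], float]] = {
    "color": overall_color,
    "variation": horizontal_variation,
}


def rule_color(a: Features, b: Features) -> float:
    """Hypothetical rule: distance in the color dimension only."""
    return abs(a["color"] - b["color"])


def rule_color_if_same_texture(a: Features, b: Features) -> float:
    """Hypothetical rule combining two dimensions: maximal distance when
    the textures differ, color distance otherwise (threshold is assumed)."""
    if abs(a["variation"] - b["variation"]) > 0.1:
        return 1.0
    return abs(a["color"] - b["color"])


# Assumed grammar: an ordered list of rules R_1, ..., R_N.
GRAMMAR: List[Rule] = [rule_color, rule_color_if_same_texture]


def measure(image: Image, vocabulary=VOCABULARY) -> Features:
    """Compute DIM_i(image) for every dimension in the vocabulary."""
    return {name: dim(image) for name, dim in vocabulary.items()}


def query(image_a: Image, database: Dict[str, Image],
          vocabulary=VOCABULARY, grammar=GRAMMAR) -> Dict[str, List[float]]:
    """For each database image B, return [dist_1(A, B), ..., dist_N(A, B)]."""
    feats_a = measure(image_a, vocabulary)
    return {key: [rule(feats_a, measure(image_b, vocabulary)) for rule in grammar]
            for key, image_b in database.items()}


flat = [[(100, 100, 100)] * 2] * 2            # uniform gray patch
stripes = [[(0, 0, 0), (255, 255, 255)]] * 2  # black and white columns
distances = query(flat, {"flat": flat, "stripes": stripes})
```

In this sketch each database image receives one distance per rule; how those per-rule distances are ranked or combined is left open here, since it depends on the grammar actually used.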
In accordance with the invention, the vocabulary may include dimensions such as overall color, directionality and orientation, regularity and placement, color purity, and pattern complexity and heaviness. The rules in the grammar may include equal pattern, overall appearance, similar pattern, and dominant color and general impression, with each of the rules expressed as a logical combination of values generated for one or more of the dimensions. The distance measure may include separate color and texture metrics characterizing the similarity of the respective color and texture of the two patterns being compared.
A major advantage of a pattern retrieval and matching system in accordance with the invention is that it eliminates the need for selecting the visual primitives for image retrieval and expecting the user to assign weights to them, as required in most current systems. Furthermore, the invention is suitable for use in a wide variety of pattern domains, including art, photography, digital museums, architecture, interior design, and fashion.