1. Field of the Invention
This invention relates to storage, indexing, and retrieval of image data, and more particularly to a method and system for generating and retrieving context vectors that represent high-dimensional abstractions of information in images.
2. Description of Background Art
Analysis of image subject content is a time-consuming and costly operation. This analysis is often required for the identification of images of interest in existing image databases and for the routing and dissemination of images of interest in a real-time environment. The conventional approach is to rely upon human intellectual effort to analyze the content of images. It would be desirable to reliably translate image data into representations that would enable a computer to assess the relative proximity of meaning among images in a database.
Certain known document retrieval systems use variable length lists of terms as a representation, but without meaning sensitivity between terms. In such systems, pairs of terms are either synonyms or not synonyms.
So-called "vector space methods" can capture meaning sensitivity, but they require that the closeness of every pair of terms be known. A typical full-scale system with over 100,000 terms might require about 5 billion relationships, an impractical amount of information to obtain and store.
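By way of illustration only, the figure of about 5 billion relationships can be checked by counting the unordered pairs among 100,000 terms. The function name `pairwise_relationships` is hypothetical and is not part of any described system.

```python
from math import comb


def pairwise_relationships(num_terms: int) -> int:
    """Count the unordered pairs that can be formed from num_terms distinct terms."""
    return comb(num_terms, 2)


# 100,000 terms yield n * (n - 1) / 2 pairs.
print(pairwise_relationships(100_000))  # 4999950000, i.e. about 5 billion
```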
Methods have also been proposed for searching documents with fixed length vectors. However, such methods require work on the order of at least the square of the sum of the number of documents and the number of terms. This is impractical for a large corpus of documents, images, or terms.
A document retrieval model based on neural networks that captures some meaning sensitivity has been proposed. A neural network consists of a collection of cells and connections among cells, where every connection has an associated positive or negative number, called a weight or component value. Each cell employs a common rule to compute an output, which is then passed along connections to other cells. The particular connections and component values determine the behavior of the network when some specified "input" cells receive a set of values. A search in a document retrieval system employing a neural network requires a number of multiplications equal to twice the product of the number of documents and the number of keywords for each of a plurality of cycles.
Other document retrieval methods use vector representations in a Euclidean space. The kernel or core used in this method comprises non-overlapping documents. This results in small dimensional vectors on the order of seven values. Vectors are generated from the core documents based upon whether or not a term appears in a document. As an alternative, the method starts with a kernel of terms which never co-occur.
It would be desirable to have a computing system that can derive accurate, efficient, and manageable representations of images for later recall, retrieval, and association.
The present invention is directed to a method and system for generating context vectors associated with images in an image storage and retrieval database system. A context vector is a fixed length series of component values or weights representative of meaning or content. Relationships among context vectors are representative of conceptual relationships among their associated items (e.g., information elements comprised in the image). Thus, two items having similar meaning or content have similarly-oriented context vectors, while items having dissimilar meaning or content have roughly orthogonal context vectors. Similarity between items is measured by calculating the dot product of the associated context vectors.
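By way of illustration only, the dot product comparison described above can be sketched as follows. The three-component vectors and the names `sky`, `cloud`, and `brick` are invented for the example; an actual context vector would have many more components.

```python
import math


def normalize(v):
    """Scale a vector to unit length so the dot product reflects orientation only."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


# Hypothetical context vectors (illustrative values only).
sky = normalize([0.9, 0.1, 0.0])
cloud = normalize([0.8, 0.2, 0.1])
brick = normalize([0.0, 0.1, 0.9])

print(dot(sky, cloud))  # close to 1: similarly oriented vectors
print(dot(sky, brick))  # close to 0: roughly orthogonal vectors
```

For unit-length vectors the dot product equals the cosine of the angle between them, which is why similarly oriented vectors score near 1 and roughly orthogonal vectors score near 0.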
Context vectors may be associated with words, terms, documents, document portions, queries, quantitative data, or any other type of information element. In the present invention, context vectors are associated with information elements, or features, derived by performing wavelet transformations at a plurality of points on each electronically stored image in the database. The transformations provide orientation-sensitive spatial frequencies on the images at a variety of orientation/frequency combinations. These features are combined to form image feature vectors or "image vocabulary" elements analogous to words in text.
A prototypical subset of feature vectors, or atoms (also called information elements), is derived from the set of feature vectors to form an "atomic vocabulary." In one embodiment, the prototypical feature vectors are derived by using a vector quantization method (e.g., self organization) in which a vector quantization network is also generated.
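By way of illustration only, one simple way to derive prototypes from feature vectors is winner-take-all competitive learning, sketched below. This is a simplified stand-in for the self-organizing vector quantization mentioned above, not the described network. The function name `train_prototypes`, the learning rate, the epoch count, and the two-component sample data are all assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_prototypes(features, prototypes, rate=0.2, epochs=20):
    """Move the closest prototype a small step toward each feature vector.

    Assumed simplification: only the winning prototype is updated, with no
    neighborhood function and no decay of the learning rate.
    """
    protos = [list(p) for p in prototypes]  # copy so the caller's list is untouched
    for _ in range(epochs):
        for f in features:
            winner = min(range(len(protos)),
                         key=lambda i: squared_distance(f, protos[i]))
            protos[winner] = [p + rate * (x - p)
                              for p, x in zip(protos[winner], f)]
    return protos


# Hypothetical feature vectors forming two loose groups.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
atoms = train_prototypes(features, prototypes=[[0.4, 0.4], [0.6, 0.6]])
print(atoms)  # one prototype settles near each group
```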
The atomic vocabulary is used to define images in the database or any new image in electronic computer-readable form. As above, a wavelet transformation is performed at a plurality of sample points on the image to generate feature vectors representing the image. The generated feature vectors are mapped to the closest atoms in the atomic vocabulary using the vector quantization network. Thus, new images are defined in terms of the established atomic vocabulary.
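By way of illustration only, mapping feature vectors to their closest atoms can be sketched as a nearest-neighbor search. The text above performs this mapping with the vector quantization network; the brute-force Euclidean search below is an assumed substitute, and the names `nearest_atom` and `quantize_image` as well as the sample values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_atom(feature, atoms):
    """Return the index of the atom closest to the given feature vector."""
    return min(range(len(atoms)),
               key=lambda i: squared_distance(feature, atoms[i]))


def quantize_image(features, atoms):
    """Represent an image as the list of atom indices of its feature vectors."""
    return [nearest_atom(f, atoms) for f in features]


# Hypothetical atomic vocabulary and feature vectors sampled from a new image.
atoms = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
features = [[0.1, 0.1], [0.9, 0.2], [0.2, 0.8], [0.95, 0.0]]

atom_ids = quantize_image(features, atoms)
print(atom_ids)  # [0, 1, 2, 1]
```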
In one embodiment, a "stop list" of high-frequency, low-information feature vectors is also generated. The stop list can be used to remove high-frequency, low-information feature vectors when using the atomic vocabulary to represent images.
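By way of illustration only, a stop list might be built by flagging atoms that occur in most images. The text above does not state the selection criterion; the fraction-of-images threshold `max_fraction`, the function names, and the sample data below are assumptions.

```python
from collections import Counter


def build_stop_list(images, max_fraction=0.8):
    """Collect atoms that appear in more than max_fraction of the images.

    images is a list of atom-index lists, one list per image.
    """
    image_count = Counter()
    for atom_ids in images:
        image_count.update(set(atom_ids))  # count each atom once per image
    limit = max_fraction * len(images)
    return {atom for atom, n in image_count.items() if n > limit}


def remove_stopped(atom_ids, stop_list):
    """Drop stop-listed atoms from one image's atom list."""
    return [a for a in atom_ids if a not in stop_list]


# Hypothetical database of five images expressed as atom indices.
images = [[0, 1, 2, 0], [0, 3, 3], [0, 2, 4], [0, 1], [0, 5]]
stop_list = build_stop_list(images)

print(stop_list)                             # {0}
print(remove_stopped(images[0], stop_list))  # [1, 2]
```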
In order to quantify conceptual relationships among atoms in the atomic vocabulary (and the images they variously represent), context vectors are employed. A context vector is associated with each atom in the atomic vocabulary. A learning law is applied to modify the context vectors as a function of the proximity of the atom to other atoms in the image and the frequency of occurrence of the atom in the image database.
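By way of illustration only, the following sketch shows one hypothetical update rule that depends on both proximity and frequency of occurrence. It is not the learning law of the invention, whose form is not given in this section. The weighting formula, the `radius` and `rate` parameters, and all names are assumptions.

```python
import math


def normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def learning_pass(context, placements, atom_freq, radius=2.0, rate=0.1):
    """One hypothetical update pass over a single image.

    context:    atom id -> context vector
    placements: list of (atom id, (x, y)) sample points in the image
    atom_freq:  atom id -> fraction of database images containing the atom
    Assumed rule: each atom is pulled toward nearby atoms, more strongly
    for closer neighbors and for neighbors that are rare in the database.
    """
    dim = len(next(iter(context.values())))
    pulls = {atom: [0.0] * dim for atom in context}
    for i, (target, p) in enumerate(placements):
        for j, (neighbor, q) in enumerate(placements):
            if i == j or target == neighbor:
                continue
            distance = math.dist(p, q)
            if distance > radius:
                continue
            weight = (1.0 - atom_freq[neighbor]) / (1.0 + distance)
            for k in range(dim):
                pulls[target][k] += weight * context[neighbor][k]
    # Apply all pulls at once so the result does not depend on visiting order.
    return {atom: normalize([c + rate * d for c, d in zip(vec, pulls[atom])])
            for atom, vec in context.items()}


# Hypothetical data: atoms 0 and 1 are adjacent, atom 2 is far away.
context = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
placements = [(0, (0.0, 0.0)), (1, (1.0, 0.0)), (2, (10.0, 10.0))]
atom_freq = {0: 0.2, 1: 0.2, 2: 0.2}

updated = learning_pass(context, placements, atom_freq)
print(dot(updated[0], updated[1]))  # positive: neighbors drift together
print(dot(updated[0], updated[2]))  # zero: distant atoms stay orthogonal
```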
Once the context vectors are established, the context vectors associated with the atoms that define an image are combined to form a summary vector for the image. The summary vector represents the overall meaning or content of the image.
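By way of illustration only, one way to combine context vectors into a summary vector is to sum them and normalize the result. The text above says only that the vectors are combined; the unweighted sum, the function name `summary_vector`, and the sample values are assumptions.

```python
import math


def summary_vector(atom_ids, context):
    """Sum the context vectors of an image's atoms and scale to unit length."""
    dim = len(next(iter(context.values())))
    total = [0.0] * dim
    for atom in atom_ids:
        for k, component in enumerate(context[atom]):
            total[k] += component
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm else total


# Hypothetical context vectors for two atoms; atom 0 occurs twice in the image.
context = {0: [1.0, 0.0], 1: [0.0, 1.0]}
summary = summary_vector([0, 0, 1], context)
print(summary)  # leans toward atom 0, which occurs more often
```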
In one embodiment, summary vectors of images are stored in clusters to reduce searching time. Images with similar information content occupy the same cluster. In one embodiment, textual index terms are associated with images in the database, and are automatically assigned to new images. Thus, textual queries can be used to retrieve images.
Images are retrieved using any of a number of query methods (e.g., images, image portions, vocabulary atoms, index terms). The query is converted into a query context vector. A dot product calculation is performed between the query vector and the summary vectors to locate the images having the closest vectors. Retrieved images are displayed in order of vector proximity, which corresponds to relative relevance to the query. In one embodiment, retrieved images are broken into sub-portions and the most relevant portions matching the query vector are highlighted in the image.
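By way of illustration only, ranking images by the dot product between a query vector and stored summary vectors can be sketched as below. The image names, vector values, and the `top_k` parameter are hypothetical, and the exhaustive scan shown does not use the clustering mentioned above.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_images(query, summaries, top_k=3):
    """Return (name, score) pairs ordered from most to least similar."""
    scored = [(dot(query, vec), name) for name, vec in summaries.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Hypothetical unit-length summary vectors for three stored images.
summaries = {
    "img_a": [0.8, 0.6, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.6, 0.0, 0.8],
}
query = [1.0, 0.0, 0.0]

results = rank_images(query, summaries)
for name, score in results:
    print(name, score)  # listed in order: img_a, img_c, img_b
```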