1. Technical Field
Embodiments of the present invention relate to associating items of various types stored in a memory.
2. Background Art
Items of various types, such as images, audio recordings, video recordings, and text, are digitally stored and are accessible through computer networks and services such as the Internet and the World Wide Web (“web”). In some cases, these items can be associated with each other based upon their origination sources or based upon specific features detected in them. For example, an image can be related to specific text in an article in which that image appears, articles that have the same or similar text can be related to each other, and an image in which a particular object is detected can be related to a text representation of the name of that object. The ability to associate items of various types that relate to a particular subject of interest is important to fully utilize the vast stores of information that are accessible through the Internet and the web.
Numerous conventional methods are available with which items of various types can be associated with each other. Conventional methods may associate items based on semantic content. However, conventional methods do not adequately scale to take advantage of the very large amounts of data that are available through the web. Furthermore, conventional methods may not adequately determine all useful semantic relationships among various items in very large collections of items.
Image annotation is an application based upon semantically associating items of various types. Known conventional methods for image annotation based on semantic relationships may not scale to very large data sets, and may not adequately determine semantic relationships to benefit from such very large data sets. Many conventional methods are based upon extracting various image features and then training independent simple classifiers, such as linear support vector machines (SVMs), for each category of images. Training independent classifiers for each category of images can be inefficient for large data sets. The performance of independent classifiers can also degrade rapidly as the number of annotations increases.
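By way of illustration only, the per-category training described above can be sketched as follows. The sketch is a simplification under stated assumptions: a basic perceptron stands in for the linear SVM named in the text, the two-dimensional feature vectors and the labels are invented, and the function names `train_linear`, `train_one_vs_rest`, and `annotate` are hypothetical. It shows that one separate classifier is trained for every category, so the training work grows with the number of categories.

```python
def train_linear(samples, epochs=20):
    """Train one binary linear classifier with the perceptron rule.

    samples: list of (feature_vector, target) pairs, target in {+1, -1}.
    Note: a perceptron is used here as a stand-in for a linear SVM.
    """
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b


def train_one_vs_rest(data):
    """Train an independent classifier for each category found in data."""
    labels = sorted({label for _, label in data})
    models = {}
    for label in labels:
        # Every category requires its own full pass over the training set.
        binary = [(x, 1 if l == label else -1) for x, l in data]
        models[label] = train_linear(binary)
    return models


def annotate(x, models):
    """Return the category whose classifier gives the highest score."""
    def score(label):
        w, b = models[label]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)


# Invented toy feature vectors, for illustration only.
data = [
    ((0.9, 0.1), "dolphin"),
    ((0.8, 0.2), "dolphin"),
    ((0.1, 0.9), "tower"),
    ((0.2, 0.8), "tower"),
]
models = train_one_vs_rest(data)
```

With thousands of categories, the loop in `train_one_vs_rest` would repeat the full training procedure thousands of times, which is the inefficiency noted above.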
An alternative non-parametric conventional approach is to use K-nearest neighbor methods to select, from a training set, images that are nearest in the image feature space to a new image and to annotate the new image based upon the annotations of the nearest images. However, finding a nearest neighbor with a high degree of accuracy may be highly inefficient when the training data set is very large.
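A minimal sketch of such a K-nearest neighbor annotation follows, assuming Euclidean distance in the feature space and a majority vote among the neighbors. Neither choice is specified in the text, and the function name `knn_annotate` and the sample data are hypothetical.

```python
import math
from collections import Counter


def knn_annotate(query, training_set, k=3):
    """Annotate query with the most common label among its k nearest images.

    training_set: list of (feature_vector, annotation) pairs.
    """
    # Exact search: the query is compared against every training image.
    ranked = sorted(training_set, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy feature vectors, for illustration only.
training_set = [
    ((0.9, 0.1), "dolphin"),
    ((0.8, 0.2), "dolphin"),
    ((0.1, 0.9), "tower"),
    ((0.2, 0.8), "tower"),
    ((0.85, 0.15), "dolphin"),
]
annotation = knn_annotate((0.7, 0.3), training_set, k=3)
```

Because the exact search visits every training image for each query, its cost grows with the size of the training set, which illustrates the inefficiency noted above.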
Other conventional approaches include concatenating image features and associated text labels for each training image, and then probabilistically relating new images to the training images. Some conventional approaches cluster pre-annotated training images based on image features, and then determine an annotation for a new image based on a similarity between features of the new image and one or more of the clusters of training images. For example, the annotation for the new image may be an annotation from the closest cluster of training images. Creating new image annotations based primarily on manually annotated images may not scale to very large data sets. Also, probabilistic approaches can be highly inefficient for large data sets, for example, because the probabilities of numerous other relationships must be recalibrated whenever the probability of one relationship is changed.
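The cluster-based approach can be sketched as follows, under several assumptions not stated in the text: the clusters have already been formed, each cluster carries a single annotation, each cluster is summarized by the mean of its feature vectors, and similarity is measured by Euclidean distance. The function names and the sample data are hypothetical.

```python
import math


def centroid(vectors):
    """Return the component-wise mean of a list of feature vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / n for i in range(dim))


def annotate_from_clusters(query, clusters):
    """Return the annotation of the cluster whose centroid is closest.

    clusters: dict mapping an annotation to the feature vectors of the
    pre-annotated training images in that cluster.
    """
    centroids = {label: centroid(vs) for label, vs in clusters.items()}
    return min(centroids, key=lambda label: math.dist(query, centroids[label]))


# Invented toy clusters, for illustration only.
clusters = {
    "dolphin": [(0.9, 0.1), (0.8, 0.2)],
    "tower": [(0.1, 0.9), (0.2, 0.8)],
}
annotation = annotate_from_clusters((0.6, 0.4), clusters)
```

The sketch depends entirely on the pre-annotated images in `clusters`, which reflects the reliance on manual annotation noted above.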