Since the first search engines were conceived, significant effort has gone into the research, design and production of reliable and efficient text search engines. The resulting engines provide a high degree of associativity between textual queries and web pages on the Internet. As technology developed, image searching appeared as a natural extension to textual searching. The basic premise was to enable a user to enter a textual string and to return relevant image files to the user based on the entered query.
The determination of relevance in the realm of images, however, is significantly more complex than that of text documents. For example, images cannot be read in the conventional manner, that is, by extracting keywords and text from the search corpus. Thus, current image search technologies rely on text surrounding an image, such as body text or text within hyperlinks associated with a given image. These methods have proven effective, but are far from perfect.
Various other techniques have employed human computation, which involves pushing a portion of the processing load to a user. An example of this strategy is an image labeling algorithm that requests a first and a second user to each enter a label for a provided image. The algorithm then compares the labels provided by the first and second users. If the labels match, the label is applied to the image; if not, both labels are discarded, as the conflicting labels indicate that neither is an appropriate label choice for the given image.
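By way of illustration only, the two-user comparison described above may be sketched as follows. The function names, the dictionary used to hold applied labels, and the case-insensitive, whitespace-trimmed comparison are assumptions made for this sketch; the text above states only that matching labels are applied and conflicting labels are discarded.

```python
from typing import Dict, Optional, Set


def normalize(label: str) -> str:
    # Assumption: labels are compared ignoring case and surrounding whitespace.
    return label.strip().lower()


def agree_on_label(first_label: str, second_label: str) -> Optional[str]:
    """Return the shared label if both users entered the same one, else None."""
    a = normalize(first_label)
    b = normalize(second_label)
    # Assumption: an empty entry never counts as agreement.
    if a and a == b:
        return a
    return None


def label_image(
    image_labels: Dict[str, Set[str]],
    image_id: str,
    first_label: str,
    second_label: str,
) -> bool:
    """Apply the label to the image only when the two users agree.

    Returns True if a label was applied, False if both entries were discarded.
    """
    agreed = agree_on_label(first_label, second_label)
    if agreed is None:
        # Conflicting labels: neither is kept for this image.
        return False
    image_labels.setdefault(image_id, set()).add(agreed)
    return True
```

In this sketch, a pair of entries such as "Dog" and " dog " would be treated as a match, whereas "cat" and "kitten" would both be discarded even though either might describe the image.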
Again, this technique provides a slight increase in accuracy, but still suffers from a variety of inherent problems. First, the scope of the process is limited by language constraints: to enter a label for an image, a user must have a reasonable command of the English language. The use of a two-player architecture further requires the two users to have the same level of proficiency in the English language, and thus adds further complications to the system. Second, the current methodology is unable to handle multiword labels efficiently, as the current art cannot support the complexity involved in multiword queries. Thus, there is a need in the art for a system and method for efficiently and accurately labeling images in a search environment.