Various methods for search and retrieval of images, such as by a search engine over a wide area network, are known in the art. Such methods typically employ text-based searching. Text-based searching employs a search query that comprises one or more textual elements such as words or phrases. The textual elements are compared to an index or other data structure to identify documents such as web pages that include matching or semantically similar textual content, metadata, file names, or other textual representations.
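By way of illustration only, the comparison of textual elements to an index described above can be sketched as a minimal inverted index. The function names `build_index` and `search`, the sample pages, and the exact-word matching rule are assumptions made for this example; they are not taken from any particular search engine, and semantic matching is not shown.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the matches for the first word, then intersect with the rest.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical web pages, each represented only by its text.
documents = {
    "page1": "red sports car for sale",
    "page2": "vintage car photos",
    "page3": "sunset over the beach",
}
index = build_index(documents)
matches = search(index, "red car")
```

In this sketch a document is found only if its text contains the query words, so an image file would be reachable only through whatever text has been attached to it.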
The known methods of text-based searching work relatively well for text-based documents; however, they are difficult to apply to image files. In order to search image files via a text-based query, each image file must be associated with one or more textual elements, such as a title, file name, or other metadata or tags. The search engines and algorithms employed for text-based searching cannot search image files based on the content of the image and thus are limited to identifying search result images based only on the data associated with the images.
Image metadata is typically derived from parent page text or cross-page anchor text. Unfortunately, parent page text and cross-page anchor text are not always available. Even when such text is available, it is not always relevant to the image. In instances where it is relevant, it is often difficult to accurately extract the relevant portion of the text. This difficulty leads to inaccurate search results that create a frustrating experience for users searching for images. A more accurate method for annotating and ranking images is needed so that the relevance of images associated with image searches can be improved.