Many search engine services, such as Google and Overture, allow users to search for information that is accessible via the Internet. These search engine services allow users to search for display pages, such as web pages, that may be of interest to them. After a user submits a search request or query that includes search terms, the search engine service identifies web pages that may be related to those search terms. To quickly identify related web pages, the search engine services may maintain a mapping of keywords to web pages. This mapping may be generated by “crawling and indexing” the web (i.e., the World Wide Web) to identify the keywords of each web page. To crawl the web, a search engine service may use a list of root web pages to identify all web pages that are accessible through those root web pages. The keywords of any particular web page can be identified using various well-known information retrieval techniques, such as identifying the words of a headline, the words supplied in the metadata of the web page, the words that are highlighted, and so on. The search engine service then ranks the web pages of the search result based on the closeness of each match, web page popularity (e.g., Google's PageRank), and so on. The search engine service may also generate a relevance score to indicate how relevant the information of each web page may be to the search request. The search engine service then displays to the user links to those web pages in an order that is based on their rankings.
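The keyword-to-web-page mapping and ranking described above can be illustrated with a minimal sketch. The page addresses, page text, and the scoring rule (a count of matching search terms) are hypothetical stand-ins chosen for illustration; they are not the method of any particular search engine service, and popularity measures such as PageRank are omitted.

```python
from collections import defaultdict

# Hypothetical toy corpus standing in for crawled pages: address -> page text.
PAGES = {
    "example.test/tulips": "yellow tulip flower picture",
    "example.test/roses": "red rose flower garden",
    "example.test/europe": "travel europe yellow fields",
}

def build_index(pages):
    """Map each keyword to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Score each page by the number of distinct query terms it contains."""
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for url in index.get(term, ()):
            scores[url] += 1
    # Highest score first; ties are broken by address for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

index = build_index(PAGES)
results = search(index, "yellow tulip flower")
for url, score in results:
    print(score, url)
```

In this sketch the page that contains all three search terms is listed first, and pages that match only one term follow it.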
These search engine services may, however, not be particularly useful in certain situations. In particular, it can be difficult to formulate a suitable search request that effectively describes the needed information. For example, if a person sees a flower on the side of a road and wants to learn the identity of the flower, the person may, upon returning home, formulate the search request of “picture of yellow tulip-like flower in Europe” (e.g., yellow tulip) in hopes of seeing a picture of the flower. Unfortunately, the search result may identify so many web pages that it may be virtually impossible for the person to locate the correct picture, assuming that the person can even accurately remember the details of the flower. If the person has a mobile device, such as a personal digital assistant (“PDA”) or cell phone, the person may be able to submit the search request while at the side of the road. Such mobile devices, however, have limited input and output capabilities, which make it difficult both to enter the search request and to view the search result.
If the person, however, is able to take a picture of the flower, the person may then be able to use a Content Based Information Retrieval (“CBIR”) system to find a similar-looking picture. Although duplicate images can be detected when the image database of the CBIR system happens to contain a duplicate image, the image database is unlikely to contain a duplicate of the picture of the flower at the side of the road. If a duplicate image is not in the database, it can be prohibitively expensive computationally, if even possible, to find a “matching” image. For example, if the image database contains an image of a field of yellow tulips and the picture contains only a single tulip, then the CBIR system may not recognize the images as matching.
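The tulip example can be made concrete with a minimal sketch. It assumes, purely for illustration, that a CBIR system compares whole-image color histograms using histogram intersection; the text above does not specify the comparison technique, and the pixel counts, color labels, and threshold below are hypothetical.

```python
from collections import Counter

def color_histogram(pixels):
    """Return the fraction of pixels that carry each color label."""
    counts = Counter(pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the sum of the per-color minimum fractions."""
    colors = set(h1) | set(h2)
    return sum(min(h1.get(c, 0.0), h2.get(c, 0.0)) for c in colors)

# Hypothetical images, each reduced to one coarse color label per pixel.
single_tulip = ["green"] * 70 + ["brown"] * 20 + ["yellow"] * 10
tulip_field = ["yellow"] * 80 + ["green"] * 15 + ["blue"] * 5

MATCH_THRESHOLD = 0.5  # hypothetical cutoff for declaring a match

similarity = histogram_intersection(
    color_histogram(single_tulip), color_histogram(tulip_field)
)
is_match = similarity >= MATCH_THRESHOLD
print(f"similarity={similarity:.2f} match={is_match}")
```

Under these assumptions the two images share only about a quarter of their color mass, so the comparison reports no match even though both images show yellow tulips.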