The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A search engine is a computer application program that helps a user to locate information based upon the user's alphanumeric input. Using a search engine, a user may enter one or more search query terms to obtain a list of resources or documents that contain or are associated with subject matter that matches those search query terms. While search engines may be applied in a variety of contexts, search engines are especially useful for locating resources or documents that are accessible through the Internet. Resources that may be located through a search engine include, but are not limited to, web documents composed in Hypertext Markup Language (HTML), word processing documents, pictures or other media, or any other type of web document that may be located and retrieved on the Internet. Once the user enters a search query, the search engine generates a list of Uniform Resource Locators (URLs) and/or other links to files, documents, or pages that are likely to be of interest to the user based upon the search query terms.
Most major search engines generate and maintain an index of the Internet by sending an automated web crawler, or bot, around the World Wide Web in order to find new web documents and existing web documents that have been updated. The web crawler makes a copy of each web document found and adds each web document's contents to the index. When a user enters search query terms into a search engine, the search engine analyzes its index and displays a list of results based upon the search query terms.
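For illustration only, the indexing and lookup steps described above may be sketched as a simple inverted index. This is a minimal sketch under stated assumptions, not a description of how any particular search engine is implemented: the function names `build_index` and `search`, the example URLs, and the whitespace-based splitting of text into terms are all hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for term in text.lower().split():
            index[term].add(url)
    return index


def search(index, query):
    """Return the set of URLs that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the documents matching the first term, then narrow.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical copies of web documents made by a crawler: URL -> page text.
crawled = {
    "http://example.com/a": "foreign films from France",
    "http://example.com/b": "films and popcorn",
    "http://example.com/c": "foreign exchange rates",
}

index = build_index(crawled)
matches = search(index, "foreign films")
```

In this sketch, only documents containing every query term are returned; a practical search engine may also match on associated subject matter rather than on literal terms alone.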
Before presenting the list of web document results to the user, a search engine orders the list based upon one or more proprietary algorithms. To order the list of web documents, a search engine may assign a rank or value to each document in the list. When the list is sorted by rank, a file with a relatively higher rank or value may be placed closer to the head of the list than a file with a relatively lower rank. The user, when presented with the sorted list, sees the most highly ranked files first. Each search engine ranks web documents based upon its own algorithm. Thus, if a search query is entered in a first search engine and the same query is entered in a second search engine, the results of the queries may differ, with web documents having different rankings or not appearing in the results list at all.
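The ordering step described above, and the way two search engines may order the same documents differently, may be sketched as follows. Because actual ranking algorithms are proprietary, the two scoring functions here are purely hypothetical stand-ins, as are the `term_count` and `inbound_links` values and the example URLs.

```python
def rank_results(documents, score):
    """Return URLs sorted so the highest-scoring document is at the head."""
    return sorted(documents, key=lambda url: score(documents[url]), reverse=True)


def score_by_term_count(doc):
    """Hypothetical first engine: rank by how often the query terms appear."""
    return doc["term_count"]


def score_by_inbound_links(doc):
    """Hypothetical second engine: rank by the number of inbound links."""
    return doc["inbound_links"]


# Hypothetical per-document values for a single search query.
docs = {
    "http://example.com/a": {"term_count": 5, "inbound_links": 2},
    "http://example.com/b": {"term_count": 3, "inbound_links": 9},
    "http://example.com/c": {"term_count": 1, "inbound_links": 4},
}

first_engine = rank_results(docs, score_by_term_count)
second_engine = rank_results(docs, score_by_inbound_links)
```

Under these assumed values, the first ordering places `http://example.com/a` at the head of the list, while the second places `http://example.com/b` at the head, even though both orderings are produced for the same query and the same documents.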
The accuracy of ranked files may be measured by relevance. As used herein, “relevance” is a measure of how accurately a particular web document matches the user's search query terms. For example, for the search query “foreign films,” a search engine would ideally rank web documents about films from foreign countries above web documents about films generally, because web documents about films from foreign countries are likely to be more relevant. Search engines constantly modify their search algorithms in order to provide the most relevant web documents to a user based upon the user's search query. Otherwise, users may resort to competing search engines that provide more relevant results. Thus, identifying and improving relevance in search results is critical to all search engine companies.