The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A search engine is a computer program that helps a user to locate information using alphanumeric input. Using a search engine, a user can enter one or more search query terms and obtain a list of resources that contain or are associated with subject matter that matches those search query terms. While search engines may be applied in a variety of contexts, search engines are especially useful for locating resources that are accessible through the Internet. Resources that may be located through a search engine include, for example, files whose content is composed in a markup language such as Hypertext Markup Language (HTML). Such files are typically called pages. One can use a search engine to generate a list of Uniform Resource Locators (URLs) and/or HTML links to files, or pages, that are likely to be of interest.
Search engines order a list of files before presenting the list to a user. As used herein, “files” may refer to, but is not limited to, any type of document that may be searched by a search engine, including web pages, web documents, or other retrievable files. To order a list of files, a search engine may assign a rank to each file in the list. When the list is sorted by rank, a file with a relatively higher rank may be placed closer to the head of the list than a file with a relatively lower rank. The user, when presented with the sorted list, sees the most highly ranked files first. To aid the user in the search, a search engine may rank the files according to relevance. Relevance is a measure of how closely the subject matter of the file matches the user's query terms.
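The ordering step described above may be illustrated with a minimal sketch. The names RankedFile and order_by_rank, the example URLs, and the numeric rank values are hypothetical and chosen only for illustration; the sketch assumes that a rank has already been assigned to each file by some means not shown here.

```python
from dataclasses import dataclass


@dataclass
class RankedFile:
    """A file paired with an already-assigned rank (hypothetical structure)."""
    url: str
    rank: float


def order_by_rank(files):
    """Return the files sorted so that higher-ranked files come first."""
    return sorted(files, key=lambda f: f.rank, reverse=True)


# Illustrative ranks only; how a rank is computed is outside this sketch.
results = order_by_rank([
    RankedFile("http://example.com/a", 0.2),
    RankedFile("http://example.com/b", 0.9),
    RankedFile("http://example.com/c", 0.5),
])

for f in results:
    print(f.rank, f.url)
```

In this sketch, the file with the highest rank is placed at the head of the list, so a user reading the list from the top sees the most highly ranked files first.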
To find the most relevant files, search engines typically try to select, from among a plurality of files, files that include many or all of the words that a user has entered into a search request. Unfortunately, the files in which a user may be most interested are too often files that do not exactly match the words that the user entered as query terms.
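The word-matching approach described above, and its shortcoming, may be illustrated with a minimal sketch. The function names tokenize and term_match_score, the scoring formula (the fraction of query words found in a file), and the sample page text are assumptions made for illustration only and are not drawn from any particular search engine.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_match_score(query, content):
    """Fraction of the query's words that appear in the content (0.0 to 1.0)."""
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokenize(content)) / len(query_terms)


# Hypothetical pages: both concern the same subject matter, but only the
# first uses the exact words of the query.
pages = {
    "page1": "Used cars for sale in Boston",
    "page2": "Pre-owned automobiles available near Boston",
}

scores = {
    name: term_match_score("used cars boston", text)
    for name, text in pages.items()
}

for name, score in scores.items():
    print(name, round(score, 2))
```

Under these assumptions, the first page contains every query word, while the second page, which may be equally of interest to the user, contains only one of the three query words and therefore receives a much lower score.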