In recent years, users have become increasingly reliant upon computers to store and present a wide range of content, including news, research, and entertainment. For example, the Internet, through its billions of Web pages, provides a vast and quickly growing library of information and resources.
In order to find desired content, computer users often make use of search utilities. For example, Internet search engines are well known in the art, and commonly known commercial engines include those provided by Google, Yahoo, and Microsoft Network (MSN). In response to a user's search query, an Internet search engine will generally provide a listing of various Web pages that may contain desired content.
Many of today's commercial search engines rely on some common techniques to provide search results. An Internet search engine generally has a substantial database in which content from billions of Web pages is stored and indexed. To gather this Web page data, a utility known as a “Web crawler” scours the Internet and pulls in text and data from known Web sites.
After the Web crawler relays the content of a Web page to the database, the text is parsed and various indices are created. These indices catalog the location of various occurrences of each word on the stored Web pages. An Internet search engine can then utilize the indices to find Web pages that contain desired search terms.
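The indexing step described above can be illustrated with a minimal sketch of a positional inverted index. This is an illustration only, not the implementation of any particular search engine; the names `tokenize`, `build_index`, and `find_pages`, the simple lowercase tokenization, and the example URLs are all assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the pages it occurs on and its word positions there.

    `pages` is a dict of {url: page_text}. The result has the shape
    {word: {url: [position, ...]}}.
    """
    index = defaultdict(lambda: defaultdict(list))
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word][url].append(position)
    return index


def find_pages(index, query):
    """Return the set of URLs whose text contains every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = None
    for term in terms:
        # Check membership first so the lookup does not create empty entries.
        urls = set(index[term]) if term in index else set()
        result = urls if result is None else result & urls
    return result


# Hypothetical stored pages, standing in for crawled content.
pages = {
    "a.example": "New York pizza guide",
    "b.example": "York has a new pizza shop",
    "c.example": "Pasta recipes",
}
index = build_index(pages)
hits = find_pages(index, "new york")
```

In this sketch, `hits` holds both `a.example` and `b.example`, because each page contains both query terms somewhere in its text. The stored positions are what would let a later step tell apart the page where the terms are adjacent from the page where they are not.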
However, a user's search will often yield thousands, if not millions, of “hits,” that is, Web pages containing each of the search terms. Accordingly, providers of search engines are tasked with scoring or ranking the various hits. Ideally, the scoring/ranking will predict which of the pages will be most useful to the user. Any commercially viable search engine must make this ranking determination very quickly so as not to delay the presentation of hits to the user. Because of these time constraints, search engine algorithms generally may perform only one pass through a hit when making scoring/ranking determinations.
Currently available search engines, however, are limited in that they do not strongly consider certain aspects of a document and/or of a user's query when making ranking decisions. For example, conventional search engines do not effectively consider word placement and word order within a user's query when ranking documents. By not giving ample weight to word placement and order, conventional search engines often fail to locate exact or near exact phrase matches in a document. This failure may cause highly relevant documents to receive a diminished ranking and to be excluded from presentation to a user. Accordingly, there is a need for an improved search engine that quickly and efficiently scores/ranks search results to find the hits that are most likely to contain content of interest to a user.
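The limitation described above can be made concrete with a small sketch. It is an assumption-laden illustration, not a description of how any commercial engine works: the helper names `contains_all_terms` and `contains_exact_phrase` and the sample documents are hypothetical. A test that checks only for the presence of each term treats the two documents alike, while a test that respects word order separates them.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all_terms(document, query):
    """True if every query term appears somewhere in the document, in any order."""
    words = set(tokenize(document))
    return all(term in words for term in tokenize(query))


def contains_exact_phrase(document, query):
    """True if the query terms appear in the document adjacent and in query order."""
    words = tokenize(document)
    terms = tokenize(query)
    n = len(terms)
    if n == 0:
        return False
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))


# Hypothetical documents for illustration.
doc_a = "A guide to New York pizza"
doc_b = "York has a new pizza shop"
query = "new york"

# Term presence alone cannot tell the two documents apart.
both_match = contains_all_terms(doc_a, query) and contains_all_terms(doc_b, query)

# Word order and adjacency separate the exact phrase match from the other hit.
phrase_in_a = contains_exact_phrase(doc_a, query)
phrase_in_b = contains_exact_phrase(doc_b, query)
```

Here `both_match` and `phrase_in_a` are true while `phrase_in_b` is false, so a ranking that uses only term presence has no basis for preferring the first document over the second.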