The World Wide Web (Web), as its name suggests, is a decentralized global collection of interlinked information, generally in the form of “pages” that may contain text, images, and/or media content related to virtually every topic imaginable. A user who knows or finds a uniform resource locator (URL) for a page can provide that URL to a Web client (generally referred to as a browser) and view the page almost instantly. Since Web pages typically include links (also referred to as “hyperlinks”) to other pages, finding URLs is generally not difficult.
What is difficult for most users is finding URLs for pages and other resources that are of interest to them. The sheer volume of content available on the Web has turned the task of finding a page relevant to a particular interest into what may be the ultimate needle-in-a-haystack problem. To address this problem, an industry of search providers (e.g., Yahoo!, MSN, and Google) has evolved.
A search provider typically maintains a database of Web pages in which the URL of each page is associated with information (e.g., keywords, category data, etc.) reflecting its content. The search provider also maintains a search server that hosts a search page (or site) on the Web. The search page provides a form into which a user can enter a query that usually includes one or more terms indicative of the user's interest. Once a query is entered, the search server accesses the database and generates a list of “hits,” typically URLs for pages whose content matches keywords derived from the user's query. This list is provided to the user.
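The lookup just described can be illustrated with a minimal sketch. It assumes a tiny in-memory database and a simple rule that a page is a hit only if it matches every query term; the URLs, keywords, and function names are hypothetical and are not taken from any actual search provider.

```python
from collections import defaultdict

# Hypothetical page database: each URL is associated with keywords
# reflecting its content (illustrative data only).
PAGES = {
    "http://example.com/cars": ["jaguar", "automobile", "engine"],
    "http://example.com/zoo": ["jaguar", "animal", "habitat"],
    "http://example.com/nfl": ["football", "team"],
}

def build_index(pages):
    """Map each keyword to the set of URLs associated with it."""
    index = defaultdict(set)
    for url, keywords in pages.items():
        for keyword in keywords:
            index[keyword.lower()].add(url)
    return index

def search(index, query):
    """Return the sorted URLs that match every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Copy the first term's URL set so the index itself is never mutated.
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return sorted(hits)

INDEX = build_index(PAGES)
```

Under these assumptions, `search(INDEX, "jaguar")` returns both the automobile page and the zoo page, since nothing in the query distinguishes between them.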
Since queries can often return hundreds, thousands, or in some cases millions of hits, search providers have developed sophisticated algorithms for ranking the hits (i.e., determining an order for displaying hits to the user) such that the pages most relevant to a given query are likely to appear near the top of the list. Typical ranking algorithms take into account not only the keywords and their frequency of occurrence but also other information such as the number of other pages that link to the hit page, popularity of the hit page among users, and so on. These ranking algorithms are an important part of algorithmic search.
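As a rough illustration of how such signals might be combined, the following sketch scores each hit as a weighted sum of keyword frequency, inbound link count, and popularity. The weights, field names, and the logarithmic damping of link counts are assumptions made for this example only; actual ranking algorithms are considerably more sophisticated.

```python
import math

def score(hit, query_terms, w_freq=1.0, w_links=0.5, w_pop=0.25):
    """Combine three ranking signals into one number (hypothetical weights)."""
    # How often the query terms occur on the hit page.
    freq = sum(hit["term_counts"].get(term, 0) for term in query_terms)
    # log1p damps very large link counts so they do not swamp other signals.
    links = math.log1p(hit["inbound_links"])
    return w_freq * freq + w_links * links + w_pop * hit["popularity"]

def rank(hits, query_terms):
    """Order hits so that the highest-scoring pages come first."""
    return sorted(hits, key=lambda h: score(h, query_terms), reverse=True)
```

With these weights, two pages that mention a query term equally often are separated by their link counts and popularity, so the more widely linked page is listed first.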
To further facilitate use of their services, some search providers now offer “search toolbar” add-ons for Web browser programs. A search toolbar typically provides a text box into which the user can type a query and a “Submit” button for submitting the query to the search provider's server. Once installed by the user, the search toolbar is generally visible no matter what page the user is viewing, enabling the user to enter a query at any time without first navigating to the search provider's Web site. Searches initiated via the toolbar are processed in the same way as searches initiated at the provider's site; the only difference is that the user is spared the step of navigating to the search provider's site.
While automated search technologies can be very helpful, they do have a number of technological limitations, a primary one being that a user often has difficulty formulating a query to direct the search to relevant content. A query that is too general might return a large quantity of hits, few of which are relevant. A query that is too specific might fail to return many relevant hits.
Contextual information provides a means of directing a user's search to more relevant content. A user often has a fairly specific context in mind at the time of making a query, but the query might not unambiguously express this context. For example, a user who enters the query “jaguar” might be thinking of the automobile, rather than the animal, the professional football team, or something else. But the entered query “jaguar” does not express this specific context.
In principle and in practice, contextual information might be gleaned from what the user was doing before or at the time of entering the query. A user is often inspired to conduct a search when prompted by information the user is currently viewing. Returning to the example above, a user who enters the query “jaguar” after (or while) viewing an automobile-related page is most likely interested in the automobile, while a user who enters the same query after (or while) viewing a page about zoos is most likely interested in the animal.
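The “jaguar” example can be sketched as a simple re-ranking step. This sketch assumes that the text of the page being viewed is available as a string and that each hit carries a list of keywords; the overlap-counting rule and all names are hypothetical and shown for illustration only.

```python
def context_terms(context_text):
    """Split the context into lowercase words, stripping basic punctuation."""
    words = {w.strip(".,;:!?\"'()").lower() for w in context_text.split()}
    words.discard("")
    return words

def rerank_with_context(hits, context_text):
    """Move hits whose keywords overlap the context toward the top."""
    ctx = context_terms(context_text)
    # sorted() is stable, so hits with equal overlap keep their original order.
    return sorted(hits, key=lambda h: len(ctx & set(h["keywords"])), reverse=True)

# Hypothetical hits for the query "jaguar".
HITS = [
    {"url": "http://example.com/zoo", "keywords": ["jaguar", "animal", "zoo"]},
    {"url": "http://example.com/cars", "keywords": ["jaguar", "automobile", "car"]},
]
```

Calling `rerank_with_context(HITS, "Review of a new luxury automobile")` places the automobile page first, while a context drawn from a page about zoos leaves the animal page on top.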
Until recently, search technologies did not provide reliable ways of gathering such contextual information or using it to respond to a query. However, as shown by the above cross-references, at least one search provider now provides an interface for gathering contextual information from a user and using that gathered information when processing the user's query.
Still, this new search technology has limitations of its own. Sometimes a user's contextual information will be quite large, as when, for example, the user selects or enters, as context, the textual content of all or part of a Web page. When given large amounts of contextual information, the new search technology might return hits that are even less relevant than those returned without the use of contextual information. In addition, contextual or other augmented search systems may rank the results in a manner less meaningful to the user than algorithmic searches.
Less relevant hits might also result from contextual information that is quite small, when that contextual information is misleading. This might occur, for example, when the user makes or adopts a spelling error when entering contextual information. Consequently, there is a need to improve contextual and other augmented search systems.