The World Wide Web (Web), as its name suggests, is a decentralized global collection of interlinked information—generally in the form of “pages” that may contain text, images, and/or media content—related to virtually every topic imaginable. A user who knows or finds a uniform resource locator (URL) for a page can provide that URL to a Web client (generally referred to as a browser) and view the page almost instantly. Since Web pages typically include links (also referred to as “hyperlinks”) to other pages, finding URLs is generally not difficult.
What is difficult for most users is finding URLs for pages that are of interest to them. The sheer volume of content available on the Web has turned the task of finding a page relevant to a particular interest into what may be the ultimate needle-in-a-haystack problem. To address this problem, an industry of search providers (e.g., Yahoo!, MSN, Google) has evolved. A search provider typically maintains a database of Web pages in which the URL of each page is associated with information (e.g., keywords or category data) reflecting its content. The search provider also maintains a search server that hosts a search page (or site) on the Web. The search page provides a form into which a user can enter a query that usually includes one or more terms indicative of the user's interest. Once a query is entered, the search server accesses the database and generates a list of “hits,” typically URLs for pages whose content matches keywords derived from the user's query. This list is provided to the user. Since queries can often return hundreds, thousands, or in some cases millions of hits, search providers have developed sophisticated algorithms for ranking the hits (i.e., determining an order for displaying hits to the user) such that the pages most relevant to a given query are likely to appear near the top of the list. Typical ranking algorithms take into account not only the keywords and their frequency of occurrence but also other information, such as the number of other pages that link to the hit page and the popularity of the hit page among users.
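As a rough illustration of the lookup-and-rank flow described above, the following minimal Python sketch scores each page by how often the query terms occur in it and adds a small bonus for inbound links. The page data, the `search` function, and the `link_weight` parameter are all hypothetical and chosen only for illustration; actual search providers use far more elaborate (and generally unpublished) ranking algorithms.

```python
from collections import Counter

# Hypothetical toy "database": each URL is associated with its page text
# and the number of other pages that link to it.
PAGES = {
    "http://example.com/cars": {"text": "jaguar car dealer jaguar sedan", "inlinks": 5},
    "http://example.com/zoo": {"text": "jaguar habitat at the zoo", "inlinks": 2},
    "http://example.com/news": {"text": "weather and sports news", "inlinks": 9},
}


def search(query, pages=PAGES, link_weight=0.1):
    """Return URLs of pages containing any query term, highest score first."""
    terms = query.lower().split()
    scored = []
    for url, page in pages.items():
        counts = Counter(page["text"].lower().split())
        term_hits = sum(counts[t] for t in terms)
        if term_hits == 0:
            continue  # page contains no query term, so it is not a hit
        # Assumed scoring rule: term frequency plus a weighted inbound-link count.
        score = term_hits + link_weight * page["inlinks"]
        scored.append((score, url))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]


print(search("jaguar"))
```

In this toy data set the news page has the most inbound links but contains no query term, so it is excluded before ranking; link counts only reorder pages that already match.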
To further facilitate use of their services, some search providers now offer “search toolbar” add-ons for Web browser programs. A search toolbar typically provides a text box into which the user can type a query and a “Submit” button for submitting the query to the search provider's server. Once installed by the user, the search toolbar is generally visible no matter what page the user is viewing, enabling the user to enter a query at any time without first navigating to the search provider's Web site. Searches initiated via the toolbar are processed in the same way as searches initiated at the provider's site; the only difference is that the user is spared the step of navigating to the search provider's site.
While automated search technologies can be very helpful, they do have a number of limitations, a primary one being that users struggle to convey enough contextual information to direct the search to relevant content. An overly broad query (too little context) can return a few needles of relevant content buried in a haystack of irrelevant hits; an overly narrow query (too much context) may result in filtering out the needles along with the hay. Often a user has a fairly specific context in mind, but this specific context may not be reflected in a query. For example, a user who enters the query “jaguar” might be thinking of the automobile, the animal, the professional football team, or something else entirely.
In principle, contextual information might be gleaned from what the user was doing prior to entering the query. It is well known that users are often inspired to conduct searches when information they are currently reviewing raises a further question. For example, a user who enters the query “jaguar” after (or while) viewing an automobile-related page is most likely interested in the automobile, whereas one who enters the same query after (or while) viewing a page about zoos is most likely interested in the animal. Existing search technologies do not provide reliable ways of gathering such contextual information or using it to respond to a query.
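To make the “jaguar” example concrete, the following minimal Python sketch reorders a list of hits by how many distinct words each hit shares with the page the user was viewing when the query was entered. It is only an illustration of the general idea under simple assumptions (plain word overlap, whitespace tokenization); the function name `rerank_with_context` and the sample data are hypothetical and do not represent any particular search provider's method.

```python
def rerank_with_context(hits, context_text):
    """Order hits by word overlap with the page the user was viewing.

    hits: list of (url, page_text) tuples, e.g. the raw results for a query.
    context_text: text of the page viewed when the query was entered.
    """
    context_words = set(context_text.lower().split())

    def overlap(hit):
        _, text = hit
        # Count distinct words shared between the hit and the context page.
        return len(context_words & set(text.lower().split()))

    # sorted() is stable, so hits with equal overlap keep their original order.
    return [url for url, _ in sorted(hits, key=overlap, reverse=True)]


# Hypothetical hits for the ambiguous query "jaguar".
hits = [
    ("http://example.com/zoo", "jaguar big cat habitat zoo"),
    ("http://example.com/cars", "jaguar luxury car sedan engine"),
]

print(rerank_with_context(hits, "new car reviews and engine specs"))
print(rerank_with_context(hits, "zoo opening hours and animal habitat"))
```

With the automobile-related context the car page is listed first; with the zoo-related context the zoo page is listed first, even though the query itself is identical in both cases.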
Therefore, it would be desirable to provide systems and methods for more efficiently identifying related content.