The evolution of computers and networking technologies from high-cost, low-performance data processing systems to low-cost, high-performance communication, problem-solving and entertainment systems has provided a cost-effective and time-saving means to lessen the burden of performing everyday tasks such as correspondence, bill paying, shopping, budgeting and information gathering. For example, a computing system interfaced to the Internet, via wired or wireless technology, can provide a user with a channel for nearly instantaneous access to a wealth of information from a repository of web sites and servers located around the world, at the user's fingertips.
Typically, the information available via web sites and servers is accessed via a web browser executing on a web client (e.g., a computer). For example, a web user can deploy a web browser and access a web site by entering the web site Uniform Resource Locator (URL) (e.g., a web address, an Internet address, an intranet address, . . . ) into an address bar of the web browser and pressing the enter key on a keyboard or clicking a “go” button with a mouse. The URL typically includes four pieces of information that facilitate access: a protocol (a language for computers to communicate with each other) that indicates a set of rules and standards for the exchange of information, a location of the web site, a name of an organization that maintains the web site, and a suffix (e.g., com, org, net, gov and edu) that identifies the type of organization.
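The URL components described above can be illustrated with a minimal sketch, assuming Python's standard urllib.parse module. The function name describe_url and the address http://www.example.com/index.html are hypothetical and chosen only for illustration. The sketch treats the last host label as the suffix and the label before it as the organization name, which is a simplification that does not hold for multi-label suffixes.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the pieces named in the text (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    has_suffix = len(labels) > 1
    return {
        # Protocol: the rules for exchanging information, e.g. "http".
        "protocol": parts.scheme,
        # Location: the full host name of the web site.
        "location": host,
        # Simplifying assumption: the label before the suffix names the organization.
        "organization": labels[-2] if has_suffix else host,
        # Simplifying assumption: the final label is the suffix (com, org, ...).
        "suffix": labels[-1] if has_suffix else "",
    }


# Hypothetical address used purely as an example.
info = describe_url("http://www.example.com/index.html")
print(info)
```

For the hypothetical address above, the sketch reports the protocol as http, the location as www.example.com, the organization as example, and the suffix as com.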
In some instances, the user knows, a priori, the URL of the site or server that the user desires to access. In such situations, the user can access the site, as described above, by entering the URL in the address bar and connecting to the site. In other cases, the user will know a particular site that the user desires to access, but will not know the URL for such site. To locate the site, the user can simply enter the name of the site into a search engine to retrieve such site. In most instances, however, the user is simply searching for information relating to a particular topic and does not know a name of a site that contains the desired information. To locate such information, the user employs a search function (e.g., a search engine) to facilitate locating the information based on a query provided by the user. Generating a query that will locate the desired information, however, can be difficult for typical searchers. More particularly, providing a query that sufficiently represents intent of the user (e.g., what information the user intends to locate) is problematic for most users. For instance, empirical data suggests that most search queries are approximately two words in length, which generally is insufficient to locate particular information based upon the query (e.g., the queries are under-specified with respect to the information that the users desire to obtain).
Currently there are a plurality of techniques employed by search engines to assist a user in narrowing a search given an underspecified query. A first approach includes employing humans to manually classify objects in a database (e.g., sites on the Internet) in a logical hierarchical manner. Such systems can be searched efficiently and are highly accurate, but are expensive to build in terms of man-hours required for classifying each object within the hierarchy. Furthermore, this technique cannot achieve sufficient coverage for many users, as objects cannot be searched for until classified. A disparate approach utilizes machine-learned text classification to automatically classify objects within a hierarchical shell. This approach achieves benefits with respect to coverage, and systems built utilizing such approach are less expensive to build (e.g., numerous humans are not required to continuously insert objects in a hierarchy). However, the hierarchical shell requires building, and such text classification schemes are static and cannot morph to fit needs of disparate users. Moreover, systems built utilizing this technique cannot adapt over time without considerable expense in re-arranging the hierarchy.
Conventional search engines can also utilize clustering techniques to mitigate the aforementioned deficiencies. For example, sites can be clustered to facilitate obtaining more relevant results with respect to a search query. A link entitled “more like this” can be associated with a returned result, and selection of the link can facilitate further clustering and/or display of documents within the cluster associated with the “more like this” link. A relevant document (and thus a relevant cluster) located via the query, however, can be returned to the user in a position that indicates that the document is not highly relevant to the query. Thus, the user could be forced to read through pages of documents in order to locate information that the user intended to find. Furthermore, the constant clustering of documents is computationally expensive.
Another exemplary system that conventional search engines employ provides a user with an alternative query when the user's entered query does not return any documents. For instance, a user can desire to find information regarding Mozart's early works. The user, as is typical, may intend to enter an under-specified query of “classical music.” If, however, the user mistakenly enters a misspelled version of that query, the search engine can determine that no documents are returned utilizing the query (because of the typo in the query). Thereafter, the search engine can prompt the user with a query that the search engine finds is substantially similar to the entered query. For instance, the search engine could prompt the user by asking, “Did you mean ‘classical music’?” If the user answers positively, the correct query can be run and results can be obtained. While such a system is useful with respect to correcting typos and misspellings, it does not provide results that are highly germane to Mozart's early works (the user's true intent). Rather, the user will be flooded with a substantial amount of information that, while related to classical music, is not related to Mozart's early works. For example, the user may have to look through hundreds of listings before locating a document containing desirable information.
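The “Did you mean” behavior described above can be approximated with a minimal sketch, assuming Python's standard difflib module for string similarity. This is not the method of any particular search engine. The function name suggest_query, the list KNOWN_QUERIES, the similarity cutoff of 0.8, and the misspelling “clasical music” are all hypothetical and chosen only for illustration.

```python
import difflib

# Hypothetical set of queries known to return documents.
KNOWN_QUERIES = ["classical music", "classic cars", "jazz music"]


def suggest_query(entered, known=KNOWN_QUERIES, cutoff=0.8):
    """Return a similar known query to offer the user, or None.

    Assumption: a query that is absent from `known` returns no documents.
    """
    if entered in known:
        # The entered query already returns documents; no prompt is needed.
        return None
    matches = difflib.get_close_matches(entered, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical misspelling, used purely as an example.
suggestion = suggest_query("clasical music")
if suggestion is not None:
    print(f"Did you mean '{suggestion}'?")
```

As the text notes, a correction of this kind repairs only the spelling of the query. The suggested query remains under-specified with respect to the user's true intent.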
Accordingly, there exists a strong need in the art for a searching system and/or methodology that assists a user in utilizing a query that will obtain results according to the user's intent.