In a typical search system, a user operating a client system issues a search query to search a document corpus and receives a set of search results via the client system. The search query may be issued from the client system to a search engine that is configured to search the document corpus, or an index thereof, for content that is relevant to the search query. The search engine may send a summary of the identified content in the form of a set of search results to the client system. The search results might include titles, abstracts, and/or links for the identified pieces of content. The search query and search results may be routed between the client system and the search engine over one or more networks, and by one or more servers coupled to the network. In many cases, the search results comprise many more hits than the querier can review, so only the first few hits might be examined. Ordering of search results is therefore important, as users often judge the quality of a search by which hits are presented first.
The network might be a local network, a global internetwork of networks, or a combination of networks. Common networks in use today include local area networks (LANs), wide area networks (WANs), virtual LANs (VLANs) and the like. One common global internetwork of networks in use today is referred to as the Internet, wherein nodes of the network send the search query to other nodes that might respond with the search results relevant to the search query. One protocol usable for networks that include search systems is the Hypertext Transfer Protocol (HTTP), wherein an HTTP client, such as a browser program operating on the client system, issues a query for search results referenced by a Uniform Resource Locator (URL), and an HTTP server responds to the query by sending the search results specified by the URL. Of course, while this is a very common example, the issuance of a query and the sending of a set of search results relevant to the query are not so limited.
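By way of illustration only, the following Python sketch shows one way a search query might be carried in a URL by an HTTP client and recovered by an HTTP server. The host name search.example.com, the path /search, and the parameter names q and page are hypothetical and are chosen solely for this example; no network connection is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url: str, query: str, page: int = 1) -> str:
    """Client side: encode the query string and page number into a URL.

    The parameter names "q" and "page" are assumptions for this sketch.
    """
    params = {"q": query, "page": page}
    return f"{base_url}?{urlencode(params)}"


def parse_search_url(url: str) -> dict:
    """Server side: recover the query string and page number from the URL."""
    fields = parse_qs(urlsplit(url).query)
    return {"q": fields["q"][0], "page": int(fields["page"][0])}


# Hypothetical search engine address, used only for illustration.
url = build_search_url("http://search.example.com/search", "jaguar speed")
request = parse_search_url(url)
```

In this sketch the space in the query is percent-encoded by the client and decoded again by the server, so the search engine receives the query string as the user typed it.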
For example, networks other than the Internet might be used, such as a token ring, a WAP (Wireless Application Protocol) network, an overlay network, a point-to-point network, proprietary networks, etc. Moreover, protocols other than HTTP might be used to request and transport search results, such as SMTP (Simple Mail Transfer Protocol), FTP (File Transfer Protocol), HTTPS (Hypertext Transfer Protocol Secure), etc. Further, content might be specified by identifiers other than URLs. It should be understood that references to the Internet can be substituted with references to variations of the basic concept of the Internet (e.g., intranets, virtual private networks, enclosed TCP/IP networks, etc.), as well as other forms of networks. It should also be understood that the operations might occur entirely within one computer or one collection of computers, thus obviating the need for a network.
Requested search results that are relevant to a query could be in many forms. For example, some search results might include text, images, video, audio, animation, program code, data structures, etc. The search results may be formatted according to the Hypertext Markup Language (HTML), the Extensible Markup Language (XML), the Standard Generalized Markup Language (SGML) or other language in use at the time.
HTML is a common format used for pages and other content that are supplied from an HTTP server. HTML-formatted content might include links to other HTML content, and a collection of content that references other content might be thought of as a document web, hence the name “World Wide Web” or “WWW” given to one example of a collection of HTML-formatted content. As that is a well-known construct, it is used in many examples herein, but it should be understood that unless otherwise specified, the concepts described by these examples are not limited to the WWW, HTML, HTTP, the Internet, etc.
As described briefly above, a set of search results may include abstracts that identify documents that are relevant to a search query. The search results, however, may include a number of results that are not what the user had in mind when formulating a query (e.g., when formulating a query string). To locate the results the user had in mind, the user may review a number of the results, for example, by scrolling through the search results, which may be displayed as a Web page on the client system. If the search results are relatively lengthy, as is common, the user may become frustrated in attempting to locate the results that the user had in mind and might end their review of the search results. Alternatively, the user might issue another search query via their client system in an attempt to locate the search results the user had in mind.
If the query is well understood by the search system and is unambiguous, an ordered presentation of the search results may present the documents most interesting to that querier first and less interesting documents later. Ordering can be important, as search results are often numerous enough that not all of the documents deemed relevant are presented in an initial display. The querier might scroll or page down to see more results, but is not likely to remain interested in the results if finding documents of interest requires much scrolling or paging.
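By way of illustration only, the following Python sketch shows why ordering determines what the querier sees in an initial display. It scores each document by a simple count of query-term occurrences, sorts the hits by that score, and returns one page at a time. The term-count scoring, the corpus, the page size, and all names used are assumptions made for this example and are not the ordering method of any particular search engine.

```python
from typing import NamedTuple


class Result(NamedTuple):
    title: str
    abstract: str
    link: str
    score: int


# Hypothetical in-memory corpus standing in for a document index.
CORPUS = [
    {"title": "Cooking pasta", "body": "boil water and add salt",
     "link": "http://docs.example.com/1"},
    {"title": "Speed limits", "body": "limits vary by road type",
     "link": "http://docs.example.com/2"},
    {"title": "Jaguar cars", "body": "jaguar is a maker of luxury cars",
     "link": "http://docs.example.com/3"},
    {"title": "Jaguar top speed",
     "body": "the jaguar is known for its speed over short distances",
     "link": "http://docs.example.com/4"},
]


def term_count(query: str, text: str) -> int:
    """Naive relevance score: total occurrences of each query term."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def search(query: str, corpus: list, page: int = 0, page_size: int = 2) -> list:
    """Return one page of hits, highest-scoring documents first."""
    hits = []
    for doc in corpus:
        s = term_count(query, doc["title"] + " " + doc["body"])
        if s > 0:  # documents with no matching term are not hits
            hits.append(Result(doc["title"], doc["body"][:60], doc["link"], s))
    hits.sort(key=lambda r: r.score, reverse=True)
    start = page * page_size
    return hits[start:start + page_size]


first_page = search("jaguar speed", CORPUS, page=0)
second_page = search("jaguar speed", CORPUS, page=1)
```

With a page size of two, the third-ranked hit appears only if the querier pages down, which is the situation described above: a document placed low in the ordering may never be examined even though it was deemed relevant.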
What is needed is an improved search apparatus and method for generating search results and ordering them for user presentation, taking into account the nature of the query.