In a content retrieval system, a user makes a request for content and receives content matching that request. The user can be a human user interacting with a user interface of a computer that processes the requests and/or forwards the requests to other computer systems. The user could also be another computer process or system that generates the request programmatically. In the latter instance, it is likely that the requesting computer user will also programmatically process the results of the request, but it might instead be the case that a computer user makes a request and a human user is the ultimate recipient of the response, or even the opposite, where a human user makes a request and a computer user is the ultimate recipient of the response.
Content retrieval systems are in common use. One common system in use today is referred to as the Internet, a global internetwork of networks, wherein nodes of the network send requests to other nodes that might respond with content. One protocol usable for requesting content is the HyperText Transfer Protocol (HTTP), wherein an HTTP client (such as a browser) makes a request for content referenced by a Uniform Resource Locator (URL) and an HTTP server responds to the request by sending the content specified by the URL. Of course, while this is a very common example, content retrieval is not so limited.
For example, networks other than the Internet might be used, such as token ring, WAP, overlay, point-to-point, proprietary networks, etc. Protocols other than HTTP might be used to request and transport content, such as SMTP, FTP, etc., and content might be specified by other than URLs. Portions of the present invention are described with reference to the Internet, a global internetwork of networks in common usage today for a variety of applications, but it should be understood that references to the Internet can be substituted with references to variations of the basic concept of the Internet (e.g., intranets, virtual private networks, enclosed TCP/IP networks, etc.) as well as other forms of networks. It should also be understood that the present invention might operate entirely within one computer or one collection of computers, thus obviating the need for a network.
The content itself could be in many forms. For example, some content might be text, images, video, audio, animation, program code, data structures, formatted text, etc. A user might, for instance, request content that is a page having a news story (text) and an accompanying image, with links to other content (such as by formatting the content according to the HyperText Markup Language (HTML) in use at the time).
HTML is a common format used for pages or other content that is supplied from an HTTP server. HTML-formatted content might include links to other HTML content and a collection of content that references other content might be thought of as a document web, hence the name “World Wide Web” or “WWW” given to one example of a collection of HTML-formatted content. As that is a well-known construct, it is used in many examples herein, but it should be understood that unless otherwise specified, the concepts described by these examples are not limited to the WWW, HTML, HTTP, the Internet, etc.
In some instances, content is accessed in response to a request for a uniquely identified content object. For example, a user seeking to obtain the content of Yahoo!'s home page for the Yahoo! Sports property can initiate a web browser client and enter the URL sports.yahoo.com in a dialog box provided by the web browser client for such purpose. In response to that request, the web browser client is programmed to make a request for the specified page to a particular server, which responds with the requested page, all as is well known to those familiar with request/response protocols such as HTTP and HTTPS.
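As a rough illustration of the request just described, the following Python sketch builds the text of a minimal HTTP/1.1 GET request for a URL. The function name build_get_request is hypothetical, and treating a URL typed without a scheme as plain HTTP is an assumption of this sketch rather than something the description above specifies. Nothing is sent over a network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request for `url`."""
    # Assumption: a URL typed without a scheme (e.g. "sports.yahoo.com")
    # is treated as an HTTP URL.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # An empty path means the server's root page.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    print(build_get_request("sports.yahoo.com"))
```

A real browser client would add further headers and would handle redirects, HTTPS, and so on; the sketch shows only the mapping from a URL to a request line and a Host header.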
In other instances, the user might not have a specific URL in mind and instead issues a more general request for content in the form of a search query. In a typical search query, the user is presented with a dialog box wherein the user enters search query terms and initiates a request based on those terms. One example of a search is a Yahoo! search. One way to perform a Yahoo! search is by directing a web browser client to the page with the URL www.yahoo.com and entering a search query in the search dialog box provided on that page. In response to such a query, which the web browser client sends to a www.yahoo.com server (or other server as directed by references contained in the page's HTML or other code), the receiving server in turn performs a search or causes a search to be performed and returns search results to the web browser client, usually in the form of a page or pages.
In one variation of a search and response currently in use, the user enters a string of one or more characters, typically in the form of one or more words or concepts (tokens) separated by delimiters, such as spaces or commas, and the search results are a page that contains several search hits organized by where they were found. For example, a search results page might list matching “Inside Yahoo!” hits, matching Yahoo! directory hits, matching sponsored hits, matching Web search hits, etc. It should be understood that “matching” can have different meanings in different search contexts. For example, in some search contexts, matching is exact and in other search contexts, matching is approximate, such as where singular forms and corresponding plural forms are considered matches.
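The tokenizing and the two notions of matching described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the singular/plural rule (dropping a trailing "s") is a deliberately crude assumption, not the matching rule of any actual search system.

```python
import re
from typing import List


def tokenize(query: str) -> List[str]:
    """Split a query into lowercase tokens on spaces and/or commas."""
    return [t.lower() for t in re.split(r"[,\s]+", query) if t]


def singular(token: str) -> str:
    """Crude singular form: drop one trailing 's' from longer tokens.

    Assumption for illustration; real systems use proper stemming.
    """
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def matches(query_token: str, doc_token: str, approximate: bool = False) -> bool:
    """Exact match by default; optionally treat singular/plural as equal."""
    if approximate:
        return singular(query_token) == singular(doc_token)
    return query_token == doc_token


if __name__ == "__main__":
    print(tokenize("Hotels, Paris  flights"))
    print(matches("hotel", "hotels"))
    print(matches("hotel", "hotels", approximate=True))
```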
Some searches are performed over all available documents, but other searches might be performed over one or more subdomains of documents available to be searched. For example, while all public Yahoo! properties might be available for a search, a search limited to the Yahoo! Travel property or the Yahoo! Sports property might be preferred. Often a user generating a query will know which subdomain to search and can so limit his or her search. However, this typically requires extra steps, such as navigating to a page associated with the particular subdomain and entering the search terms there.
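The difference between searching all documents and searching one subdomain can be illustrated with a small in-memory sketch. The document list, the property labels, and the search function below are all made up for illustration and do not reflect how any actual search property is implemented.

```python
from typing import List, Optional

# Made-up documents, each tagged with the subdomain (property) it belongs to.
DOCUMENTS = [
    {"property": "travel", "title": "Cheap flights to Paris"},
    {"property": "sports", "title": "Paris marathon results"},
    {"property": "travel", "title": "Rome hotel guide"},
]


def search(term: str, subdomain: Optional[str] = None) -> List[str]:
    """Return titles containing `term`, optionally limited to one subdomain."""
    term = term.lower()
    return [
        doc["title"]
        for doc in DOCUMENTS
        if term in doc["title"].lower()
        and (subdomain is None or doc["property"] == subdomain)
    ]


if __name__ == "__main__":
    print(search("paris"))            # search over all documents
    print(search("paris", "travel"))  # search limited to one subdomain
```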
One solution for subdomain searching is to provide a browser or other software with a search dialog box that processes searches based on search words that map to XML files indicating how to perform a search with the various pages associated with subdomains. For example, search strings beginning with “dic” would be processed by an XML file dic.xml that contains instructions on how the client should simulate the user entering the remaining arguments into the search dialog box that would be provided on the page that is associated with the “dic” command. While this might work well for pages that do not change in structure, the pages used are typically not under the control of the client, and the XML files are stored locally at the client. Because of this, when the maintainer of, for example, the dictionary web site to which the dic.xml file is directed changes the structure of the page, the search might fail to operate properly, requiring each client to rewrite or update its XML instructions for accessing that changed page and simulating user entry of a search.
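The client-side approach just described can be sketched roughly as follows. A Python dictionary stands in for the locally stored per-keyword XML files such as dic.xml, whose actual format is not given here; the URL templates use placeholder example.com hosts, and all names are hypothetical.

```python
from urllib.parse import quote_plus

# Hypothetical stand-in for locally stored per-keyword instruction files
# (e.g. dic.xml). Each template hard-codes the target page's query format.
HANDLERS = {
    "dic": "http://dictionary.example.com/search?q={terms}",
    "travel": "http://travel.example.com/search?q={terms}",
}

# Hypothetical fallback used when no keyword matches.
DEFAULT = "http://search.example.com/search?q={terms}"


def route_query(query: str) -> str:
    """Map a search string to a request URL based on its leading keyword."""
    query = query.strip()
    keyword, _, rest = query.partition(" ")
    rest = rest.strip()
    template = HANDLERS.get(keyword.lower())
    # No known keyword, or a keyword with no arguments: general search.
    if template is None or not rest:
        return DEFAULT.format(terms=quote_plus(query))
    return template.format(terms=quote_plus(rest))


if __name__ == "__main__":
    print(route_query("dic serendipity"))
    print(route_query("cheap flights"))
```

Because each template encodes how a third-party page expects its query, a change to that page silently breaks the corresponding entry on every client until it is updated, which is the weakness noted above.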
What is needed is an improved search using subdomains and other techniques.