This invention relates to resource search and discovery systems that operate in a distributed enterprise system, such as a computer network or an intranet. These systems are conventionally called enterprise search systems (ESSs). Such a system might be available to users working in the enterprise or might be used as a search tool available to users outside of the organization via a mechanism such as a company website.
Currently, ESSs are centralized in that there is typically one application that is responsible for collecting content from the enterprise network or intranet. This application is commonly known as a “spider” or “robot” and locates documents. An indexer then indexes the content of those documents. Subsequently, other applications allow users to query the index, for example, via a web-based query interface.
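By way of illustration only, the centralized arrangement described above can be sketched as three stages: a spider that locates documents, an indexer that indexes their content, and a query function over the resulting index. The sketch below is a minimal example under stated assumptions. The in-memory `INTRANET` mapping, the URLs, and the function names `crawl`, `build_index`, and `query` are hypothetical and do not describe any particular ESS product.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "intranet": URL -> (document text, outgoing links).
INTRANET = {
    "http://intranet/home": ("welcome to the engineering portal",
                             ["http://intranet/specs", "http://intranet/faq"]),
    "http://intranet/specs": ("design specs for the search indexer",
                              ["http://intranet/home"]),
    "http://intranet/faq": ("faq about the search portal", []),
}

def crawl(seed, fetch):
    """Spider: breadth-first traversal from a seed URL; returns url -> text."""
    seen, queue, found = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:  # document is unreachable or has been removed
            continue
        text, links = page
        found[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found

def build_index(documents):
    """Indexer: build an inverted index mapping each term to a set of URLs."""
    index = defaultdict(set)
    for url, text in documents.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def query(index, terms):
    """Query interface: return the sorted URLs that contain every query term."""
    words = terms.lower().split()
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))

documents = crawl("http://intranet/home", INTRANET.get)
index = build_index(documents)
print(query(index, "search portal"))  # ['http://intranet/faq']
```

In this sketch a single process performs all collection and indexing, which mirrors the centralization described above: a document becomes searchable only after the one spider has reached it.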
Such a centralized approach creates a number of problems. First, in the general case, the people responsible for administering the ESS are not the people who create the content. This means that an administrator cannot easily tell whether a particular document should be included in the index. As a result, the administrators of a centralized system tend to spend most of their time making sure that the machines stay running, that the spider does not run out of control or hang, and that search results for common queries are relevant.
Second, the content of the index in the ESS can rapidly become out-of-date with respect to the actual documents available on the system. Documents that are removed from the system result in “dead links” in search results, which leads to frustration for searchers. At the same time, new content is not included in the index immediately: it must wait until it is located by the spider. This delay can lead to duplication of intellectual effort if, for example, a problem must be re-solved.
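The two kinds of staleness described above can be illustrated by comparing the set of documents recorded in the index at the last crawl with the set of documents actually available. The sketch below is illustrative only. The URL sets and the function name `index_drift` are hypothetical.

```python
# Hypothetical snapshot of the URLs recorded in the index at the last crawl.
indexed_urls = {
    "http://intranet/specs",
    "http://intranet/faq",
    "http://intranet/old-plan",
}

# Hypothetical set of documents actually available on the system now.
live_urls = {
    "http://intranet/specs",
    "http://intranet/faq",
    "http://intranet/bug-fix-notes",
}

def index_drift(indexed, live):
    """Return (dead_links, unindexed) as sorted lists of URLs.

    dead_links: indexed documents that no longer exist.
    unindexed:  existing documents the spider has not yet located.
    """
    dead_links = sorted(indexed - live)
    unindexed = sorted(live - indexed)
    return dead_links, unindexed

dead_links, unindexed = index_drift(indexed_urls, live_urls)
print(dead_links)  # ['http://intranet/old-plan']
print(unindexed)   # ['http://intranet/bug-fix-notes']
```

Both lists remain non-empty until the spider revisits the affected locations, which is the delay referred to above.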
Third, because people cannot find the information that they need, local search systems start to appear. For example, there are currently many desktop search engines available. A desktop search engine is a program that operates on a desktop to index personal content such as e-mail messages, visited web pages, and local documents in a variety of formats. Such a local search system could be used, for example, to search an archive of e-mail messages sent to a number of aliases related to a given project. Other local search systems may involve a search engine running on a server shared by a group of users. These local search systems locate more timely and up-to-date content, but move the burden of system administration to the people creating the content. Furthermore, because local search systems often use differing technologies, such systems lead to a proliferation of search technologies within the enterprise. Discovering the existence of these local systems and then reconciling the search results produced by a number of different search engines is a difficult problem.
Fourth, people who have content that they would like to make available in the enterprise have no easy way to ensure that this content is included in the ESS. A typical strategy used to make sure that the information is available in the enterprise is to include the information on a web server, and then to attempt to make the spider visit that web server.