A search engine is generally part of an information retrieval system that retrieves information from a data repository according to a search query provided by a user. A search query can be expressed as one or more keywords. In response to the keywords, a search engine can generate a list of documents that contain the keywords.
To increase the speed of a search engine, documents in the data repository can be indexed prior to the search. The indexing operation collects, parses, and stores data relating to the documents in a search index to facilitate fast and accurate document retrieval.
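As an illustration of the indexing step described above, the following is a minimal sketch of an inverted index, which is one common form of search index and is assumed here only for the sake of example. The function names (`tokenize`, `build_index`, `search`) and the sample documents are hypothetical and do not come from any particular system.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Collect and parse each document, then store keyword -> document ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query keyword."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical sample repository, indexed once before any search is run.
documents = {
    1: "Search engines retrieve documents",
    2: "An index speeds up document retrieval",
    3: "Servers store the search index",
}
index = build_index(documents)
print(search(index, "search index"))  # prints [3]
```

Because the index is built ahead of time, a query only intersects a few precomputed sets rather than scanning every document in the repository. The sketch matches exact keywords only; it performs no stemming or ranking.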
A typical document retrieval system includes a front-end component and a back-end component. The front-end component includes one or more client applications (e.g., web browsers and/or special-purpose client applications) that generate search queries and send them to the back-end component. The back-end component processes the search queries, consults a search index, and accesses a data repository to retrieve the requested documents.
In some enterprise document retrieval systems, the search index may contain more data than a single server can easily manage. It is therefore important to distribute this data efficiently across multiple storage locations managed by multiple servers.
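To make the idea of distributing index data concrete, the sketch below partitions a keyword index across several shards by hashing each keyword. Hash partitioning is an assumption chosen for illustration, not a method stated in the text, and the names `shard_for`, `partition_index`, and `lookup` are hypothetical. Each shard stands in for a storage location managed by one server.

```python
import hashlib


def shard_for(term, num_shards):
    """Map a keyword to a shard number using a stable hash.

    hashlib is used because Python's built-in hash() for strings
    varies between processes.
    """
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition_index(index, num_shards):
    """Split a keyword -> document ids mapping into num_shards smaller ones."""
    shards = [dict() for _ in range(num_shards)]
    for term, doc_ids in index.items():
        shards[shard_for(term, num_shards)][term] = set(doc_ids)
    return shards


def lookup(shards, term):
    """Ask only the shard responsible for the keyword."""
    shard = shards[shard_for(term, len(shards))]
    return sorted(shard.get(term, set()))


# Hypothetical index: keyword -> ids of documents containing it.
sample_index = {
    "search": {1, 3},
    "index": {2, 3},
    "retrieval": {2},
    "servers": {3},
}
shards = partition_index(sample_index, num_shards=3)
print(lookup(shards, "index"))  # prints [2, 3]
```

With this scheme a lookup touches a single shard, but changing the number of shards moves most keywords to a different shard, which is one reason real deployments may prefer other partitioning strategies.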