1. Field of the Invention
Embodiments of the invention relate to search indexing. More specifically, embodiments of the invention relate to a parallel segmented index that supports both incremental document indexing and incremental term indexing.
2. Description of the Related Art
A search engine may use a search index to identify and return documents responsive to a search query, which may include one or more search terms. The search index (or simply index) may be generated over an entire corpus of documents and may improve the efficiency with which relevant documents in the corpus are identified for a search query. For example, the search index may provide a mapping from each indexed term to the documents that include that term. In one embodiment, the search index may also provide a mapping from a document to the terms included in that document. If a document is added to the corpus, the index may need to be modified to accommodate the new document. Modifying a large index may be costly in terms of computation time and resources.
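The two mappings described above may be illustrated by the following minimal sketch. It is not taken from any particular search engine: the names `build_index` and `search`, the whitespace tokenization, and the requirement that a document match every query term are assumptions made for illustration only.

```python
from collections import defaultdict


def build_index(corpus):
    """Build both mappings over a corpus given as {doc_id: text}.

    Hypothetical helper: terms are lowercased, whitespace-separated tokens.
    """
    term_to_docs = defaultdict(set)  # indexed term -> ids of documents containing it
    doc_to_terms = {}                # document id -> terms included in that document
    for doc_id, text in corpus.items():
        terms = set(text.lower().split())
        doc_to_terms[doc_id] = terms
        for term in terms:
            term_to_docs[term].add(doc_id)
    return term_to_docs, doc_to_terms


def search(term_to_docs, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(term_to_docs.get(terms[0], set()))
    for term in terms[1:]:
        result &= term_to_docs.get(term, set())
    return result
```

In this sketch, accommodating a new document means updating the single shared `term_to_docs` structure, which is the kind of modification that may become costly as the index grows.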
However, a search index may be designed to support incrementally indexing a document (i.e., indexing the document without modifying the entire search index). For example, a search index may be divided into one or more segments. Each segment may index a subset of the corpus. Thus, the search index may add a new segment to include a new document without modifying other (existing) segments. Because the cost of adding a document is then bounded by the size of a segment rather than by the size of the entire index, limiting the size of a segment may allow the search index to incorporate new documents more quickly.
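The segmented arrangement may be sketched as follows. This is an illustrative assumption rather than a description of any specific implementation: the class names `Segment` and `SegmentedIndex`, the method `add_documents`, and the single-term `search` are hypothetical.

```python
class Segment:
    """Indexes a fixed subset of the corpus; never modified after creation."""

    def __init__(self, docs):
        # docs: {doc_id: text}; postings maps each term to the ids containing it.
        self.postings = {}
        for doc_id, text in docs.items():
            for term in set(text.lower().split()):
                self.postings.setdefault(term, set()).add(doc_id)


class SegmentedIndex:
    """A search index divided into one or more segments (hypothetical sketch)."""

    def __init__(self):
        self.segments = []

    def add_documents(self, docs):
        # New documents go into a new segment; existing segments are untouched.
        self.segments.append(Segment(docs))

    def search(self, term):
        # A query consults every segment and merges the per-segment results.
        hits = set()
        for segment in self.segments:
            hits |= segment.postings.get(term.lower(), set())
        return hits
```

In this sketch the work done by `add_documents` depends only on the documents being added, while `search` must visit each segment, so a larger number of small segments trades faster incremental indexing for more work at query time.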