To support document retrieval for large-scale content management solutions, full-text search solutions are available. For speedy response times and flexible search options, full-text search queries may rely on text indexes (also referred to as “text search indexes”) to access relevant data.
Populating a text index requires resource-intensive steps: parsing documents, applying language-specific processing to the terms found in the documents, and writing the processed terms into the index. Processing a large number of documents may therefore take significant time and resources. To balance workloads, it is necessary to provide controls that enable fine-tuning of the index processing.
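The three steps named above (parsing, language-specific term processing, and writing to the index) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the function names, the tiny stop-word list, and the crude plural stripping merely stand in for real language-specific processing, and the in-memory dictionary stands in for a persistent text index.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real system would use language-specific resources.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}


def parse(text):
    """Step 1: split raw document text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def process_terms(tokens):
    """Step 2: toy language-specific processing (stop words, plural 's')."""
    terms = []
    for token in tokens:
        if token in STOP_WORDS:
            continue
        # Crude plural stripping; leaves words such as "access" untouched.
        if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
            token = token[:-1]
        terms.append(token)
    return terms


def build_index(documents):
    """Step 3: write processed terms into an inverted index (term -> doc ids)."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in process_terms(parse(text)):
            index[term].add(doc_id)
    return index


documents = {
    1: "The archive of stored documents",
    2: "Parsing documents to build an index",
    3: "Search queries access the index",
}
index = build_index(documents)
print(sorted(index["document"]))
print(sorted(index["index"]))
```

Even in this toy form, every document passes through all three steps, which suggests why the cost of populating an index grows with the size of the corpus.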
One such scenario arises as demands on the capabilities of indexing (also referred to as “text indexing”) increase: customers may need to re-create text indexes if a new text index solution is not backward compatible with the previously used text search solution, for example, due to incompatible storage mechanisms or differences in index data structures.
Such a backward incompatibility requires re-indexing of the document corpus. However, the documents might already be archived in high-latency storage, with only their metadata kept online. Retrieving the documents in such scenarios may therefore add significantly to the duration of the index processing and may require multiple sessions.
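To make the notion of re-indexing across multiple sessions concrete, the following sketch processes a bounded number of documents per session and records how far it got. It is only one possible arrangement under stated assumptions: `run_session`, the `checkpoint` counter, and the `max_documents` limit are hypothetical names, and a plain dictionary stands in for the high-latency archive.

```python
def run_session(doc_ids, fetch, index_document, checkpoint, max_documents):
    """Index up to max_documents documents, starting at the checkpoint.

    Returns the new checkpoint, i.e. the number of documents processed so far.
    """
    end = min(checkpoint + max_documents, len(doc_ids))
    for doc_id in doc_ids[checkpoint:end]:
        # fetch() stands in for a slow retrieval from archive storage.
        index_document(doc_id, fetch(doc_id))
    return end


# Hypothetical archive and target index, both plain dictionaries here.
archive = {"a": "first text", "b": "second text", "c": "third text"}
doc_ids = sorted(archive)
indexed = {}

checkpoint = 0
sessions = 0
while checkpoint < len(doc_ids):
    checkpoint = run_session(
        doc_ids, archive.__getitem__, indexed.__setitem__,
        checkpoint, max_documents=2,
    )
    sessions += 1

print(sessions, len(indexed))
```

Limiting the number of documents per session is one simple way to bound the work done at a time; a real system would also need to persist the checkpoint between sessions.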