The vast majority of documents we create or archive are stored electronically. To find particular documents quickly, the relevant data from these documents is typically extracted, catalogued, and organized in a centralized database so that the documents are searchable. In some circumstances, these databases can be very large. For example, a lawsuit may involve over a million documents. Searching databases of this size can be problematic.
Depending on the size of the document collection, indexing the documents can take hours or even days. Once an index has been built, it needs to be maintained as documents are added to or deleted from the database. However, these incremental builds leave the database inoperable while they run. As a result, incremental builds are not performed very often, which leaves portions of the database inaccurate.