1. Field of Invention
The present invention relates generally to the field of indexing data. More specifically, the present invention is related to indexing data without suspending data manipulation.
2. Discussion of Prior Art
Current state-of-the-art schemes for building a new index on data require that all insert, update, or delete operations (on the data) be suspended for the duration of the index build operation. The emphasis in such prior art schemes is on raw speed, to reduce the amount of time the data is not available for manipulation.
FIG. 1 illustrates such a prior art scheme, wherein the scheme involves: (a) scanning a data set; (b) extracting an index key from each data record; (c) pairing the index key with a record identifier (also called a RID); (d) sorting the resulting pairs by key and RID; and (e) building the index structure from these ordered key/RID pairs. In such prior art schemes, the amount of time in which the data is not available for manipulation may be considerable.
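By way of illustration only, steps (a) through (e) above may be sketched as follows. The sketch is not taken from any cited reference; the names build_index_offline, extract_key, and lookup are hypothetical, the data set is assumed to be an in-memory sequence of (RID, record) pairs, and a sorted list stands in for the index structure (a practical system would typically build a tree-structured index instead). The sketch assumes that no insert, update, or delete operation touches the data set while the function runs, which is precisely the restriction described above.

```python
from bisect import bisect_left
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]      # hypothetical record layout
KeyRid = Tuple[Any, int]     # (index key, record identifier)


def build_index_offline(
    data_set: Iterable[Tuple[int, Record]],
    extract_key: Callable[[Record], Any],
) -> List[KeyRid]:
    """Build an index while the data set is assumed to be frozen."""
    # (a) scan the data set, (b) extract the key from each record,
    # (c) pair the key with the record's RID.
    pairs = [(extract_key(record), rid) for rid, record in data_set]
    # (d) sort the pairs by key, then by RID.
    pairs.sort()
    # (e) "build" the index: here the sorted list itself is the index,
    # a stand-in for a real index structure.
    return pairs


def lookup(index: List[KeyRid], key: Any) -> List[int]:
    """Return the RIDs of all records whose key equals `key`."""
    # (key,) sorts before every (key, rid), so this finds the first match.
    pos = bisect_left(index, (key,))
    rids = []
    while pos < len(index) and index[pos][0] == key:
        rids.append(index[pos][1])
        pos += 1
    return rids
```

For example, given records identified by RIDs 10 through 13 and keyed on a hypothetical "name" field, build_index_offline returns the pairs in key order, and lookup then retrieves every RID for a given key by binary search. Any change made to the data set during the build would not be reflected in the resulting index, which is why such schemes suspend data manipulation for the duration of the build.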
The references discussed below describe, in general, the procedure of indexing data. It should be noted that the prior art fails to teach, either directly or indirectly, the present invention's method for building indexes on data concurrently with the manipulation of that data.
The U.S. patent to Ambroziac (U.S. Pat. No. 6,460,047 B1) provides for a technique for indexing data. The disclosed method describes compressing an index to obtain an index that is easily stored and transmitted. The disclosed invention also provides for the decompression of such a compressed index. One disclosed embodiment maintains a separate index for each document, thereby allowing for easy updating of indexes in response to changes in documents and easy transmission of indexes, which allows distributed searching. The claimed technique provides very compact indexing information, but allows the indexing information to be very rapidly processed.
The U.S. patent to Kirsch et al. (U.S. Pat. No. 5,920,854) provides for a real-time document collection search engine with phrase indexing. The disclosed collection search system is responsive to a user query against a collection of documents to provide a search report. The collection search system includes a collection index, including: (a) a first predetermined single word and multiple word phrases as indexed terms occurring in the collection of documents, (b) a linguistic parser that identifies a list of search terms from a user query, the linguistic parser identifying the list from second predetermined single words and multiple word phrases, and (c) a search engine coupled to receive the list from the linguistic parser.
The U.S. patent to Agarwal et al. (U.S. Pat. No. 5,842,196) provides for a database system and method for updating records such as are commonly used in a relational database environment. Updates are carried out in a manner which allows a substantial portion of the work to be performed in direct mode (when possible), thereby avoiding the inefficiency of re-reading records. In this fashion, a scenario which requires deferred updating, in accordance with the present invention, can be treated mostly as a direct update with minimal deferred updating.
The U.S. patent to Reinsch et al. (U.S. Pat. No. 4,868,744) provides for a method for restarting a long-running, fault-tolerant operation in a transaction-oriented database system without burdening the system log. A restartable load-without-logging method permits the restart of a LOAD operation from the last COMMIT point without requiring the writing of images of loaded records to the log. Instead, the method logs only a minimal amount of information, recording positions within the data sets to be loaded and within the tablespace being loaded.
The European patent to Fuller (EP0767435 A1) provides for a transaction device driver technique for a journaling file system. The transaction device driver logs any updates as the data appears through normal read/write/strategy entry points into the driver and, should the system fail while there are outstanding operations, the driver ensures that either all or none of the changes for the operation will appear in the file system.
Whatever the precise merits, features, and advantages of the above-cited references, none of them achieves or fulfills the purposes of the present invention.