In the prior art, a well-known process that is executed periodically for DB2 installations is to check the indexes of a database. Generally speaking, this entails ensuring that the physical data records (sometimes referred to as rows) are properly indexed. That is, for each data row should have exactly one index entry (e.g., "Last Name" for a personnel data record), and that the key value, in the physical data record or row matches the key value as stored in the appropriate index entry for that data row.
Some nomenclature is introduced here to aid in understanding the terminology used. A principal term used here is that of a "data set." Generally speaking, a data set is a collection of data that is referred to in an operating system environment (e.g., the well-known MVS operating system environment) by a single name, in much the same way as a word processing file might be given a single directory name for easy retrieval of the data in the file (even though the data might physically be stored in a variety of locations on a disk). Examples of MVS data sets include DB2 indexes and DB2 table spaces. Typically, maintenance operations in the DB2 environment involve three steps: First, reading the data to be reorganized (e.g., from a table space or index), often from a variety of different physical locations identified by a data set name. The step of data reading is typically referred to as an UNLOAD process that involves physically copying the data to some other memory or other storage. The second step is that of sorting or otherwise ordering the data to conform to the desired ordering and performing any other desired processing. Finally, the third step is that of rewriting the sorted data to storage (table space or index) designated with the same data set name. The third step is typically referred to as a RELOAD process.
A conventional way of checking DB2 indexes is illustrated in FIGS. 1 and 2. One or more DB2 indexes 100 are UNLOADed in a first processing phase 105, which typically reads the index entries and sorts them into the order of the physical sequence of the rows indexed. A work or intermediate data set 110 is created from the output of the first processing phase 105. In a second processing phase 115, the contents of the work data set 110 are methodically compared with the actual physical data records read from DB2 table space(s) 120 to check the index integrity as described above, and appropriate diagnostic message(s) 125 are displayed. This process is shown in more detail in FIG. 2. The index(es) in question are UNLOADed sequentially at block 200. At block 210, the index(es) are sorted by the data row identifier (sometimes referred to as the row ID or RID). At block 220, the physical data records are read, and at block 230, the respective data records are compared with the corresponding index key values from the index-related work data set 110. At block 240, appropriate conventional diagnostic routines are executed to check the index integrity.
A significant disadvantage of the foregoing prior-art approach to the CHECK INDEX operation is that the various indexes are UNLOADed one after another, and subsequently the physical data records are read. The elapsed time required for the operation thus includes the sum of the elapsed times required for each of the index UNLOAD operations plus the time required for the reading of the associated physical data records. This is especially undesirable in installations which attempt to operate 24 hours a day, seven days a week; it may be highly undesirable to disable a data set, in effect, for the length of time required for a two-phase CHECK INDEX operation. Moreover, such an approach plainly requires additional input/output (I/O) steps, which can have tangible financial costs associated therewith. In some actual systems, storage space may be at a premium and thus the use of storage to create a working or intermediate data set may be undesirable.