This invention relates to using indexes to retrieve stored information.
Information in a relational database 10 (FIG. 1), for example, is stored as records 14a, 14b, . . . in tables 12a, 12b, . . . Each record contains data values for one or more fields 16a, 16b, . . . . 
In a database of financial accounts, a table called ACCOUNTS may have a record for each financial account. The fields of the ACCOUNTS table could include an account identifying number (ACCT_ID) and a tax reporting number (TRN). The database may have other tables, and relationships may be defined between fields of different tables.
One simple, but often inefficient way to find a record that pertains to a given ACCT_ID is to search through the table, record by record, for the one that has that ACCT_ID.
To avoid having to do a full record-by-record search, a database system typically allows a user to define an index 20 for the table. The index has a field 22a that contains values of a xe2x80x9ckeyxe2x80x9d field of the table, e.g., field 16a. The entries of the index may be organized in a way that makes it easy to find an index entry that has a certain key value. For example, the index entries may be sorted in the numerical order of the values in the key field.
Each index entry contains a key value and an associated locator 24 that points to the location of the table record that corresponds to the key value of the record. Once a desired field value is found in the index, the table record can be accessed quickly using the locator.
Although an index consumes additional storage space without adding any more information to what is already in the tables of the database, the use of an index for retrieval saves other computer resources because the index is faster to search. On the other hand, additional computer resources are required to deal with the index when new records are inserted or records are deleted in the table and may be required when records are updated. These operations on records may require changing the corresponding entry in one or more indexes.
As an example of an index search, if the index key field 22a corresponds to ACCT_ID numbers in field 16a of the table then a desired TRN in field 16b corresponding to a known ACCT_ID value can be quickly retrieved by searching the index for the entry that has the known ACCT_ID value and then accessing the table record that is identified by the locator found in that index record. Yet even this simple process can use a lot of computer resources if there is a high rate of database queries in which an index is searched for an ACCT_ID and the records are accessed to get the corresponding TRN.
Database systems allow a user to create a special kind of index, called a composite index, in which (in our example) the TRNs from the table appear in a second field of the index, as part of a composite key, together with the corresponding ACCT_ID key field. (Composite keys are also known as concatenated keys, compound keys, and multi-field keys.) This permits a simplified search process, called index-only searching, in which a TRN is retrieved directly from the index without having to access an underlying table record.
If the key value in each entry of an index is unique, it is possible to locate unambiguously a single record associated with a given value of the key. Database systems therefore allow a user to specify that an index have a key that is unique. The database system is capable of enforcing the uniqueness of the key but doing so costs computer resources.
In unique indexes, all fields of the index taken together determine uniqueness. In our example, if there were a unique index on the ACCT_ID and TRN fields, the entire combination would be analyzed in determining uniqueness.
The computer resources that must be expended to perform an index search of a database table may be reduced by storing, in the index, additional information (extra data) from, e.g., the associated table and refraining from using the extra data when searching. Such an index may be called an augmented index. Although the extra data, e.g., is not used for searching, it can be used to return data for an index-only search. The key and/or the extra data in an augmented index can contain more than one field.
A useful kind of augmented index has a unique key and is called a unique augmented index. In a unique augmented index, the uniqueness constraint is not enforced on the extra data but only on the unique keys portion of the index entry.
The invention is also useful with a conventional composite unique index which is redundant with a xe2x80x9csmallerxe2x80x9d unique index (i.e., the key fields of the smaller index are a proper subset of the key fields of the larger index). In earlier systems, the uniqueness constraint would be enforced on the composite index. Using the invention, the uniqueness constraint need only be enforced on the xe2x80x9csmallerxe2x80x9d unique index, which saves computer resources.
Thus, in general, in one aspect, the invention features a method for use in retrieving information from computer-stored records. An index of entries is provided that contains values that are keys for respective the stored records, the keys being used to reduce the time required to locate records. Additional information, included in the index of entries, is used for a purpose other than to reduce the time required to locate records. In connection with computer operations associated with the index, the additional information is treated in a manner different from the manner in which the keys are treated.
Implementations of the invention may include one or more of the following features. The index may be an index to a single table of a database or a join index to at least two tables. The index may be unique or non-unique. The keys may be used for locating records or checking uniqueness. The additional information may be derived from the records of the database (or other stored records) and may be used as data. The additional information may be stored in the index in a compressed form and may be of at least two different types.
In general, in another aspect, the invention features a method of forming an index for use in retrieving information from computer-stored records. As before, the entries contain values that are keys for respective ones of the stored records. At least some of the entries also contain additional information. Data is also stored that identifies the additional information as information that need not be treated as unique during computer operations associated with the index. In implementations of the invention, the data that is also stored may include a bit map pointing to fields that contain additional information or may include a value that indicates a number of fields that contain additional information.
In general, in another aspect the invention features providing a composite unique index of entries that contain values that are keys for respective ones of the stored records, the keys being used to reduce the time required to locate records, the key fields including a proper subset of key fields that are a smaller unique index of the entries. In connection with computer operations associated with the index, the key fields that are not part of the proper subset are treated in a manner different from the manner in which the key fields in the proper subset are treated.
Among the advantages of the invention are one or more of the following.
A unique augmented index has essentially the same size and can be maintained (after an insert, update, or delete) in essentially the same way and at essentially the same cost as a single corresponding conventional unique index. One augmented index can provide the functionality and performance of two (or more) conventional indexes. Thus the augmented index can render some conventional indexes superfluous, making it unnecessary to create or maintain them. Database operations are improved because less main memory is used. Fewer index entries are read during index-only access. Better query optimization is achieved because there are fewer indexes to consider. Database administration is easier because there are fewer indexes to create and manage. A unique augmented index can both enforce uniqueness and provide the improved performance of an index-only access. Less secondary memory (disk space) is used by one augmented index compared with multiple conventional indexes. Most database systems keep a main memory cache of frequently used database blocks to avoid reading them from secondary memory each time they are needed. Eliminating the superfluous index saves space in the cache.
Other advantages and features will become apparent from the following description and from the claims.