1. Technical Field
This invention generally relates to a database management system performed by computers, and more specifically relates to the optimization of structured query language (SQL) queries using an encoded vector index (EVI) to process a DISTINCT function.
2. Background Art
An index in a book facilitates locating information on a specific topic quickly and without blindly paging through the book. Database indexes provide similar benefits by providing a method to quickly locate data of interest. Without an index, a database performs a full table scan, blindly searching through every row in a database table until the target data is located. Thus, depending upon where data resides in a database table, such a scan can be a lengthy and inefficient process.
Indexed scans of database tables are more efficient than full table scans since the length of database index entries are in most cases shorter than the database table entries. Shorter entries mean that more index entries can be stored in a single computer page. Indexed scans can therefore result in a considerable reduction in the total number of computer pages that must be processed in order to locate the requested data.
While indexed scans of database tables can improve performance, the complexity of the data being scanned and of the nature of the database query still determine how effectively a query can be implemented. Different queries place differing levels of processing demands on the database in unique ways. As a result, different index types are needed to cope with a users"" ever-changing workloads. One type of index is the encoded vector index (EVI), disclosed U.S. Pat. No. 5,706,495, Chadha et al., Jan. 6, 1998, Encoded-Vector Indices For Decision Support and Warehousing (hereinafter xe2x80x9cChadhaxe2x80x9d), which is incorporated by reference.
An encoded vector index (EVI) is a variation of the bitmap index concept. A bitmap index indicates whether a specific value exists for each row in a particular column. One bit represents each row. Thus, in the bitmap index for the value xe2x80x9cMNxe2x80x9d in the column xe2x80x9cLOCATION,xe2x80x9d the nth bit equals 1 if the nth row of the data table contains xe2x80x9cLOCATIONxe2x80x9d=xe2x80x9cMN,xe2x80x9d or 0 if that row holds a value other than xe2x80x9cMN.xe2x80x9d An EVI serves a similar purpose, but only one index is necessary to account for all the values occurring in the column (whether they be xe2x80x9cNY,xe2x80x9d xe2x80x9cMN,xe2x80x9d or any other). So in an EVI on the xe2x80x9cLOCATIONxe2x80x9d column, the nth position of the EVI contains a bit code that identifies the value of xe2x80x9cLOCATIONxe2x80x9d in the nth row of the table. Thus, whereas a separate bitmap index is required to map each particular key value in a database field, only one EVI is required to represent the same information. Thus, an EVI saves computer memory by including all possible key values for a given field in one database index.
Chadha discloses a method to efficiently scan relational database information by performing bit-vector operations on EVI""s, instead of performing analogous operations on the relational database table itself. However, the usage of and dependence on relational database tables has increased dramatically during the past decade, and continues to increase. Thus, new ways to use database tools, such as EVI""s, are needed in order to continue to provide significant improvements in query performance; otherwise, database users will be hampered in their ability to maximize intelligent information retrieval.
According to the present invention, an apparatus and method to significantly improve performance of the SQL DISTINCT function processing through the use of an encoded vector index (EVI) is disclosed. An EVI provides the data necessary to generate query results for SQL DISTINCT functions that specify one or more database fields upon which the EVI is built. Sequentially scanning through the EVI symbol table, each unique value or combination of values in the one or more database fields specified in the DISTINCT function are returned to provide the requested query results. By processing the EVI symbol table in lieu of more traditional database indexes, and/or in lieu of the database table itself, query results for SQL DISTINCT functions are generated significantly faster.
The foregoing and other features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings.