1. Field of Use
The present invention relates to data processing systems and more particularly to database management systems.
2. Prior Art
Typically, today's enterprise or legacy systems store large quantities of data in database systems accessed by database management system (DBMS) software. In such database systems, data is logically organized into relations or tables, wherein each relation can be viewed as a table in which each row is a tuple and each column is a component of the relation designating an attribute. It has become quite common to use relational database management systems (RDBMS) to enable users to enter queries written in a database query language, such as SQL, in order to obtain or extract requested data.
In compiling-type database management systems, an application program containing database queries is processed for compilation prior to run time. Such processing can also be done at run time, and more frequently is, by users of the INTEREL product discussed herein. Users of other database products, such as DB2, do such processing prior to run time.
During compilation, database queries are passed to the database management system for compilation by a database management system compiler. The compiler translates the queries contained in the application program into machine language. Generally, a database compiler component referred to as a query optimizer is included in the database management system to select the manner in which queries will be processed. This is because most users do not input queries in formats that suggest the most efficient way for the database management system to address the query. The query optimizer component analyzes how best to conduct the user's query of the database in terms of optimum speed in accessing the requested data. That is, the optimizer typically transforms a user query into an equivalent query that can be computed more efficiently. This operation is performed at compile time, in advance of execution.
A major component of the RDBMS is the database services component or module that supports the functions of the SQL language, such as definition, access control, retrieval and update of user and system data. Such components may utilize a multilayer structure containing submodules or components for carrying out the required functions. For example, one such system includes a series of components or, conceptually, a series of layers for carrying out the required functions for accessing data from the relational database. More specifically, a first layer functions as a SQL director component that handles requests at the interface to the requesting or calling application program. A second layer consists of two major components, an optimizer for optimizing the query and a RAM code generation component. The optimizer processes the query, for example, by determining the appropriate access plan strategy. The code generation component (Codegen) generates code according to such plan for accessing and processing the requested data. The access plan defines the type of access to each table, the order of access, and whether any sorts or joins are performed, along with other related information.
The generated code is passed to a third layer that functions as a relational file manager (RFM) component. This component layer performs the relational file processing function of translating the code-generated requests into IO file read/write requests. A fourth layer that functions as an IO Controller performs the requested I/O operation designated by such IO file requests, which results in reading/writing the relational database files in page increments. The described architecture is characteristic of the INTEREL product developed and marketed by Bull HN Information Systems Inc. For information concerning this product, reference may be made to the publication entitled "Database Products INTEREL Reference Manual INTEREL Performance Guidelines," Copyright 1996 by Bull HN Information Systems Inc., Order No. LZ93 Rev01B.
It was noted that index searches are very common operations in relational databases. They occur when processing SELECT, UPDATE or DELETE statements. Because they occur so frequently, index searching is an area where a performance improvement could result in a substantial benefit for a relational database system such as one having the above architecture. For example, the above architecture processes the following typical query as follows:
Select accountID from tableFunds where accountBalance >= 50000;
With DDL:
create table tableFunds (accountID int, accountBalance numeric(11,0));
create index i1 on tableFunds (accountBalance);
After an examination of this query, it is seen that many accounts may qualify. Because the index is on the accountBalance column and index data is in sorted order, the query would be processed in the following manner. First, the index entry for 50,000 must be retrieved; then the subsequent index entries (i.e., all entries for accountBalances of 50,000 or greater) must be retrieved until the index is exhausted.
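The processing order described above can be illustrated with a minimal sketch. It assumes a toy in-memory index held as a sorted list of (key, database key) pairs; the names `index_i1` and `range_scan` and the sample values are hypothetical and are not taken from the INTEREL product.

```python
from bisect import bisect_left

# Hypothetical contents of index i1 on accountBalance:
# (key, database key) pairs kept in sorted key order.
index_i1 = [(1200, 7), (49999, 3), (50000, 9), (50000, 12), (75000, 1), (310000, 4)]


def range_scan(index, low_key):
    """Return the database keys of every entry whose key is >= low_key."""
    keys = [key for key, _ in index]
    pos = bisect_left(keys, low_key)  # locate the first qualifying entry
    dbks = []
    while pos < len(index):           # walk forward until the index is exhausted
        dbks.append(index[pos][1])
        pos += 1
    return dbks


result = range_scan(index_i1, 50000)
print(result)  # [9, 12, 1, 4]
```

Because the entries are sorted, only the first qualifying entry needs to be searched for; every later entry is reached by stepping forward.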
The above architecture performs index processing in the following two steps. First, the Codegen component layer calls the RFM component layer to search for a specific index value. This is called a Find Index search, which is used to locate the database key (DBK) of a record from an index key value provided by the caller (i.e., the user's search request). Once the RFM component layer finds the index entry, it establishes a currency to it. This currency is control information that indicates which fine level index entry corresponds to the search request. This currency information is stored in the RFM component layer's schema structure. The RFM component layer establishes a currency ID for the currency from currency ID information that the code generation component layer sets in an RFM data structure (RFM_XPT) prior to the call.
In the second step, the Codegen component layer calls the RFM component layer to return the next index entry (i.e., Search Next Index entry). The index fine level entries are in sorted order and because the Codegen layer passes in the currency ID from the prior search, the RFM layer can go to the currency information stored in the schema structure and use it to find the next index entry without repeating the index search. After the RFM layer has identified the next index entry, it updates the currency information in the schema. This second step is repeated until query processing is complete. This process is quite time consuming in terms of the overhead expended in invoking/calling lower component layers to perform index processing.
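The two-step protocol can be sketched as follows. This is an illustrative model only: the class `RelationalFileManager`, its methods, and the `calls` counter are hypothetical stand-ins for the RFM layer, and the Find Index step is assumed to position on the first entry whose key is greater than or equal to the search value.

```python
from bisect import bisect_left


class RelationalFileManager:
    """Toy stand-in for the RFM layer; names and structure are hypothetical."""

    def __init__(self, entries):
        self.entries = sorted(entries)  # fine level entries: (key, DBK)
        self.currencies = {}            # currency ID -> position of current entry
        self.calls = 0                  # number of calls made into this layer

    def find_index(self, key, currency_id):
        """Step 1: position on the first entry with key >= `key`."""
        self.calls += 1
        pos = bisect_left(self.entries, (key,))
        if pos == len(self.entries):
            return None
        self.currencies[currency_id] = pos  # establish the currency
        return self.entries[pos]

    def search_next(self, currency_id):
        """Step 2: use the stored currency to return the next entry."""
        self.calls += 1
        pos = self.currencies[currency_id] + 1
        if pos >= len(self.entries):
            return None
        self.currencies[currency_id] = pos  # update the currency
        return self.entries[pos]


def run_query(rfm, low_key, currency_id=1):
    """Collect DBKs the way generated code might: one find, then repeated next."""
    dbks = []
    entry = rfm.find_index(low_key, currency_id)
    while entry is not None:
        dbks.append(entry[1])
        entry = rfm.search_next(currency_id)
    return dbks


rfm = RelationalFileManager(
    [(1200, 7), (49999, 3), (50000, 9), (50000, 12), (75000, 1), (310000, 4)]
)
dbks = run_query(rfm, 50000)
```

For this sample data the toy RFM layer is entered five times to return four qualifying entries, which illustrates the per-entry call overhead noted above.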
Accordingly, it is a primary object of the present invention to provide a more efficient method and system for improving relational index processing.
The above objects are achieved in a preferred embodiment of the present invention that can be utilized in a relational database management system (RDBMS) that implements the Structured Query Language (SQL) standard. The present invention is a system and method that enhances the index processing performance of a multi-layer relational database manager. According to the teachings of the invention, the code generation component layer of the database manager includes an index processing performance enhancing subroutine designed to execute functions performed by lower component layers substantially faster than if the functions were executed by such lower component layers. The output code generated by the code generation component layer includes calls to the index processing performance enhancing subroutine, thereby incorporating such subroutine into the output code.
The subroutine includes logic for establishing the conditions under which the particular subroutine is invoked during the execution of a SQL request. In the preferred embodiment, the logic detects when more than one search next index operation is requested for a particular query. On the second search next index request, the subroutine logic examines the fine level index page (CI) from which the prior index entry was retrieved. This CI was obtained by RFM during the initial B Tree index search and resides in the buffer pool. If there have been no changes in index currency and in the fine level index, then the subroutine copies the fine level index entry to the requestor's key buffer along with a database key value. Also, the currency information is updated to point to the next fine level index entry.
When the logic detects the presence of the above conditions, calls to the lower component layers (i.e., the RFM and IO component layers) are eliminated, thus bypassing these layers. This results in a substantial increase in performance.
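The bypass conditions described above can be sketched as follows. This is a simplified illustration rather than the subroutine itself: `IndexPage`, `Currency`, `fast_search_next`, the page version counter and the `changed` flag are hypothetical, and the sketch further assumes that the fast path applies only while the next entry lies on the same buffered page.

```python
class IndexPage:
    """Toy fine level index page (CI) held in the buffer pool."""

    def __init__(self, entries):
        self.entries = entries  # sorted (key, DBK) pairs
        self.version = 0        # assumed to be bumped whenever the page changes


class Currency:
    """Toy currency: which entry on which page the last search returned."""

    def __init__(self, page, slot):
        self.page = page
        self.slot = slot
        self.page_version = page.version  # page state seen when currency was set
        self.changed = False              # set if the currency was moved elsewhere


def fast_search_next(currency, key_buffer):
    """Try to return the next DBK without calling the lower layers.

    Returns None when the caller must fall back to the RFM layer.
    """
    page = currency.page
    if currency.changed or page.version != currency.page_version:
        return None                  # currency or fine level index has changed
    nxt = currency.slot + 1
    if nxt >= len(page.entries):
        return None                  # next entry is not on this page
    key, dbk = page.entries[nxt]
    key_buffer[:] = [key]            # copy the entry to the requestor's key buffer
    currency.slot = nxt              # update the currency to the entry just returned
    return dbk


page = IndexPage([(50000, 9), (50000, 12), (75000, 1)])
cur = Currency(page, 0)
buf = []
first = fast_search_next(cur, buf)   # served from the buffered page
```

When the function returns None, generated code in this sketch would simply make the ordinary Search Next Index call, so correctness does not depend on the fast path succeeding.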
It will be appreciated that not all sequential index searches processed by the output code and the enhanced index processing subroutine will be able to bypass the lower component layers. That is, it is still necessary to call the layer that performs I/O operations (i.e., the IO component layer) while doing a sequential index search if more than one result is required to be processed. This occurs in the case where a result is returned in response to a SELECT statement used in conjunction with a FETCH cursor statement and a subsequent query issues an index search to an identical index in another database (model).
To protect against any possible page (CI) integrity problem, the IO component layer must be called to refresh the fine level index CI pointer once the first or original search request is resumed. However, even in the case where the IO component layer must be called, performance is still greatly enhanced by bypassing the RFM component layer. According to the present invention, at code generation time, the code generation component layer places into the output code generated for a particular query the control logic for determining when a result has been processed and for setting a result processed flag indicator. Those types of queries that do not return a result or that return only one result will not have the code generated for them that sets the result processed flag. Queries that fall into this category are DELETE, UPDATE and SELECT INTO (e.g., SELECT COUNT(*)).
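The role of the result processed flag can be illustrated with a small decision sketch. The function names, the string labels for the layers, and the exact fallback behavior when the index has changed are assumptions made for illustration; only the statement categories are taken from the text above.

```python
# Statement kinds that, per the text, return no result or only one result,
# so no flag-setting code is generated for them.
NO_FLAG_STATEMENTS = {"DELETE", "UPDATE", "SELECT INTO"}


def emits_result_flag_code(statement_kind):
    """Code generation time: should flag-setting code be placed in the output?"""
    return statement_kind.upper() not in NO_FLAG_STATEMENTS


def layers_to_call(result_processed, index_unchanged):
    """Run time: which lower layers a search next request must still call.

    Assumed policy for this sketch:
      - index or currency changed -> ordinary path through RFM and IO
      - a result was processed    -> IO only, to refresh the CI pointer
      - otherwise                 -> no lower layer is called
    """
    if not index_unchanged:
        return ["RFM", "IO"]
    if result_processed:
        return ["IO"]
    return []


full_bypass = layers_to_call(result_processed=False, index_unchanged=True)
io_only = layers_to_call(result_processed=True, index_unchanged=True)
```

Under these assumptions, a cursor-based SELECT still avoids the RFM layer after each FETCH, while DELETE, UPDATE and SELECT INTO never set the flag and can bypass both layers whenever the index is unchanged.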
The above objects and advantages of the present invention will be better understood from the following description when taken in conjunction with the accompanying drawings.