A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates generally to information processing environments and, more particularly, to retrieval and processing of information stored in a data processing system, such as a Database Management System (DBMS).
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as xe2x80x9crecordsxe2x80x9d having xe2x80x9cfieldsxe2x80x9d of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of a database management system is known in the art. See e.g., Date, C., An Introduction to Database Systems, Volume I and II, Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
DBMS systems have long since moved from a centralized mainframe environment to a de-centralized or distributed environment. One or more PC xe2x80x9cclientxe2x80x9d systems, for instance, may be connected via a network to one or more server-based database systems (SQL database server). Commercial examples of these xe2x80x9cclient/serverxe2x80x9d systems include Powersoft(trademark) clients connected to one or more Sybase Adaptive Server(trademark) database servers. Both Powersoft(trademark) and Sybase Adaptive Server(trademark) are available from Sybase, Inc. of Emeryville, Calif.
As the migration to client/server continues, each day more and more businesses are run from mission-critical systems which store information on server-based SQL database systems, such as Sybase Adaptive Server(trademark). As a result, increasingly higher demands are being placed on server-based SQL database systems to provide enterprise-wide decision support. Accordingly, there is much interest in improving the performance of such system, particularly in terms of execution speed and reliability.
The present invention provides a methodology for using look ahead lists to significantly reduce latching costs associated with index pages in non-covered index scans, without the necessity of using a shadow page. (For covered queries, the datarow is not retrieved; all columns needed by the queryproc (query processing module) are provided from the index row.) In many cases, the copying costs are eliminated too.
In database systems, a xe2x80x9cB+-treexe2x80x9d is a common indexing scheme that is employed. A primary B+-tree has data records at the leaf row, while a secondary B+-tree has index keys, where a key consists of a key value and a ROWID of the data record. The index manager of a database system typically provides the following interface methods to facilitate scans:
(1) startscan( ): to start a scan on an index, optionally with some search arguments (or SARGs), which are simple conditions that can be expressed in the form of column  less than relop greater than  value, where relop can be  less than ,  greater than ,  less than =,  greater than =, !=or =and column is a column in the index. If the index is on xe2x80x98namexe2x80x99 column of an Employee table, name=xe2x80x9cJoexe2x80x9d is an example of a SARG.
(2) getnext( ): to retrieve ROWID from the next qualifying index key. This ROWID is used by the system""s data layer to retrieve the data record.
(3) endscan( ): to terminate the scan and clean up resources.
A conventional approach for implementing getnext( ) would be as follows:
However, in the above approach, there is a latch/unlatch operation for every qualifying key. If there are 100 qualifying keys on a page, the latched/unlatched cycle occurs 100 times.
Some systems solve this problem by using a shadow page. In that scheme, when a page is latched, all qualifying keys are copied to shadow page. Subsequently, the rows are returned from shadow page, avoiding the latch requests on the original page. However, the shadow page is a memory overhead. The present invention, in contrast, provides a methodology that solves this problem without using a shadow page. In addition, it also saves the cost of copying qualifying keys in some cases.
The preferred approach may be summarized as follows.
(1) The database system uses a xe2x80x9clook aheadxe2x80x9d list (LA) to save up to a preset number (e.g., 60) of ROWIDs (row identifiers) from the qualifying index keys. Since a ROWID is typically small (e.g., about 6 bytes), the space needed to store 60 ROWIDs is typically much smaller than the underlying page size (which typically range from 2K to 32K in systems). Thus, in step 3 above, instead of finding just one next qualifying key, up to a selected number (e.g., 60) of qualifying keys on the current page are found and the ROWIDs from them are stored in the look ahead list.
(2) In addition to space needed for look ahead list, the approach uses space to store one index key.
The scan is considered to be in NOCOPY mode if it satisfies the following conditions:
1. The scan is not for a cursor statement, AND
2. The scan is required to hold a lock on its current row, AND
3. The scan is not for an update statement which is modifying one of the index columns.
Otherwise, it is considered to be in DCOPY mode.
If the scan is in NOCOPY mode, then copying every qualifying key is not needed. In such a case, STEP 5 is not needed. If a restart is needed, i.e., in STEP 2A, the index key is built from the data record which corresponds to last ROWID returned. On the other hand, if the scan is in DCOPY mode, then STEP 5 is omitted if latch is not being held on the index page. Instead, the index key is built from the data record and copied while holding the latch on data page. If the data layer finds that the data record does not qualify, then there is no need to build or copy the index key.
The following definitions are provided for purposes of illustration.