The invention relates generally to computer databases and, more particularly, to an interface between a relational database application program and a database management system which includes locking and commit facilities.
Databases are computerized information storage and retrieval systems. A database manager, also known as a database management system (DBMS) or a relational database management system (RDBMS), is a complex and sophisticated computer program that provides a variety of tools for defining, manipulating data in, controlling access to and otherwise managing the database in a variety of ways. The database management system stores and retrieves data as database records organized into tables which consist (conceptually) of rows and columns of data. Conventionally, database records are accessed a row at a time.
It is quite possible for a database management system to access database records in response to commands typed one at a time by individual on-line users accessing the database from terminals, which typically are remotely distributed. However, embodiments of the invention are concerned with batch mode execution of database application programs that include programming statements (program code) written in a language such as SQL (Structured Query Language). As a matter of convenience, such program statements are sometimes referred to hereinbelow and in the accompanying drawings as SQL statements, and include statements to create, modify and delete database records in accordance with the intended purpose of the application program.
A significant characteristic of computer operating systems and database management systems with which the invention is concerned is that they are multitasking. Thus, more than one on-line user and/or more than one batch-mode application program can interact with the database through the database management system at the same time.
A particular relational database management system in combination with which the invention may be employed is known as DB2, a product of International Business Machines Corporation (IBM). DB2 runs on mainframe computers under operating systems such as OS/390. Versions are also available for various personal computer operating systems. The invention, however, is not in any way limited to use in combination with DB2, and may, for example, be employed in combination with other relational database management systems such as Oracle, Sybase, Informix and SQL Server.
Historically, an IBM software product named Time Sharing Option (TSO) was employed as part of a computer operating system to enable batch mode database application programs to access the records of a database managed by DB2. DB2 itself had no batch mode interface.
In view of various limitations associated with TSO, other approaches have been developed to run DB2 database application programs in batch mode, such as the approach in a product known as Database Attach™, marketed by Softbase Systems, Inc., of Asheville, N.C. (url http://www.softbase.com) The Database Attach™ product is an interface between application programs and the Call Attachment Facility of the DB2 database management program, which interface does not require the TSO environment. Database Attach™ handles execution of SQL statements, passing them on to the DB2 Call Attachment Facility. An ENQ feature of Database Attach™ that virtually eliminates deadlock/timeout conditions between competing batch-mode applications is disclosed in Blair U.S. Pat. No. 5,369,764.
An essential capability of a practical, multi-user database management system, such as DB2, is to control concurrency. Thus, in an environment where more than one application process or user is accessing the same database, there is always a danger that the actions of one application process or user can interfere with those of another, unless suitable controls are in effect. Preventing essentially simultaneous transactions from interfering with each other, in other words, controlling concurrency, means making sure that data is not seen or operated on by another user or batch application program, until a pending change is complete. Accordingly, relational database management systems include a locking facility which “locks” database records with pending changes on behalf of one user or process to prevent access by another user or process.
A classic example involves the sale of seats on an airplane. Two potential customers, at different locations, may be accessing (either directly or through an agent) a database which indicates that a particular seat is available. If both customers then decide to purchase a ticket for the seat, the possibility exists that an update to the database on behalf of the first customer indicating that the seat has been sold to the first customer may be overwritten by an update to the database on behalf of the second customer indicating that the seat has been sold to the second customer. More specifically, in this situation (which should not be allowed to occur in a practical system) the second customer has been allowed to update the database on the basis of information that is no longer current, instead of being forced to see the most current information.
As another example, a banking transaction might involve the transfer of funds from one account to another. Such a transaction would require that the funds be subtracted from the first account, then added to the second. Following the subtraction step, the data is inconsistent. Only after the funds have been added to the second account is consistency reestablished. Only when both steps are complete, should the records involved in the transaction be accessible to other application processes.
There are of course many variations of situations where such undesirable results can occur when database records are updated, unless some form of concurrency control is implemented, such as locking.
Thus, database application programs (comprising SQL statements) are organized to effect units of work, which may also be termed transactions. Transaction management means ensuring that a set of SQL statements is treated as a unit; transaction management guarantees either that all of the SQL statements within a unit of work are completed, or that none is completed. At the beginning of execution of a sequence of program statements that define a unit of work, the locking facility locks database records with pending changes to prevent access by any other program or user. Each logical unit of work ends with a commit statement, and the database management system includes a corresponding commit facility which effects a commit operation to commit pending database changes, and release locks. (Alternatively, a rollback operation may be performed prior to commit, thus backing out of pending database changes.) Once committed, database changes are accessible by other application programs and users, and can no longer be backed out by a rollback. A unit of work may also be described as a recoverable sequence of program statements executed by the database management system for an application program. (A broader term, which encompasses other recoverable resources used by the application program, in addition to relational databases, is a unit of recovery. In the context of the invention, units of work and unit of recovery have the same meaning.) In the banking transaction example above, a unit of work begins (and locks are established) when SQL statements to effect the transaction begin to access the account records. The unit of work ends when both steps are complete, and a commit statement is executed. At any time, an application process has a single unit of work, but the life of an application process can involve many units of work as a result of commit or rollback operations.
To avoid unduly causing contention with (i.e. locking out) other users and application processes, a properly written database application program is organized as a plurality of logical units of work, each ending with a commit statement.
Correspondingly, if another application program or user attempts to access a database record which has been locked, that other database application program or user is required to wait until the first application program or user has finished its unit of work or transaction, and has allowed the locks to be released.
Many applications are designed to commit at frequent intervals during execution. An example is a batch application designed to perform updates to database tables that are shared by an online application and where it is desired to reduce online contention.
A drawback of locking, particularly where an application program is designed to commit at frequent intervals during execution, relates to the efficient use of computer resources. Significant processing overhead is associated with a commit operation, which accordingly is relatively time consuming. For example, when a commit operation is performed, data which has been held in relatively fast random access memory (RAM) is externalized by being written to disc, a much slower operation than accessing random access memory. An application program could be issuing commit statements hundreds or even thousands of times per second, requiring the database management system to spend a relatively large percentage of its time performing commit operations.
Thus, there are times when it is desirable to reduce the frequency at which commit operations are performed on behalf of an application program, thereby reducing the amount of time required for the application to execute.
Referring again to an example above, a batch application may have been designed to perform updates to database records that are shared by on-line applications. The batch application program, in an attempt to reduce contention, would be programmed so as to commit frequently. However, there may be times when the batch application is executed during which contention is not a major concern.
In prior versions of the Database Attach™ product, a “variable commit frequency” feature is implemented, which accepts a COMMITFREQ parameter. The COMMITFREQ parameter acts as a frequency divider value by which the frequency at which commit statements are issued during program execution is divided to determine a reduced frequency at which commit operations are actually performed by the commit facility. As an example, a COMMITFREQ parameter value of ten might be employed. Accordingly only every tenth commit statement issued by the application program is passed on to the database management system, and the other nine are suppressed. As a result, application program execution is speeded up (but at the expense of potential increased contention as viewed by other application processes or users).