The invention relates to distributed database management systems and to the management of savepoints within distributed database management systems.
A database is a storehouse of collected, recorded, and related data, with inherent value in the focus of the data elements, the logical coherency of data, the inherent meaning of the data, and the use to which the data can be put. Additionally, because of the database organization, the accessibility of records and files, and dynamic updatability of the data, the database has value far beyond the value of the sum of the individual elements of the database. The ability to routinely, repetitively, and consistently add data to the database, and to recover relevant, timely, correct data from the database, is a function of its database management system (DBMS). A DBMS is a collection of computer application programs that create the database organization (schema and metadata), and that store, retrieve, and update files in the database. Large databases are managed by complex database management systems (DBMS""s) that have mechanisms to assure the timeliness and validity of the data. The mission critical requirement of a DBMS is that the data be correct and be current.
Correct data, that is collected and reported in a timely fashion, is critical to a multi-user database. This is the reason for concurrency control. Concurrency control assures that several users trying to access and update the same data do so in a controlled manner so that the results of the updates are correct, and also assures that readers of the data see current and committed data, not stale or uncommitted data. A typical example is the assignment of individual airplane seats in an airline reservations database. Even with many reservationists assigning many seats on many flights, the concurrency control capability of the DBMS must assure that only one passenger is assigned to one seat, and, conversely, that one seat is assigned to only one passenger
1. DBMS Access Statements
Operations on database files include the xe2x80x9cData Manipulationxe2x80x9d operations of SELECT, INSERT, UPDATE, and DELETE, among others.
2. The SELECT operation retrieves data from a database, and returns the data to the requestor.
3. The INSERT operation enters data into a database.
4. The UPDATE operation changes the value of an existing record.
5. The DELETE operation deletes a record from the database.
2. Transaction Processingxe2x80x94Read and Write Operations of a Transaction
Those operations which change the file, as Insert, Delete, and Modify, require xe2x80x9cConcurrency Control.xe2x80x9d These operations typically follow a Read operation and are a part of a Write operation. Typically the Read operation, e.g., Read_Item(x), reads a database item named xe2x80x9cxxe2x80x9d into a program value in the end user""s program named xe2x80x9cxxe2x80x9d. The Write_Item(x) operation writes the value of the end user program variable, x, into the database item x.
3. Transaction Processingxe2x80x94Concurrency Control
Concurrency control assures that several end users seeking to modify (insert, update, delete) records substantially simultaneously do so in a controlled manner, so that the data stored in the database is correct. It also assures that readers see current and committed data, rather then stale data or uncommitted data.
4. Transaction Processing and Transaction States
There are various means of controlling concurrency in a single database. Most common, by way of example, is xe2x80x9clocking,xe2x80x9d so that when one user attempts to modify, update, or delete a record, no other user can access the record until the transaction is completed. A transaction may complete with commitment to the changes made by the transaction (COMMIT) or with reversal of the changes made by the transaction (transaction rollback). The decision to rollback a transaction may be that of the application program or may be a system decision (for example, in response to a condition preventing commit). xe2x80x9cCompletionxe2x80x9d is a non-specific term, and can mean writing the data, or going beyond writing the data to xe2x80x9ccommittingxe2x80x9d the transaction, before allowing another user access to the record. Concurrency control results in either (1) xe2x80x9ccommittingxe2x80x9d the transaction, or (2) aborting the transaction and xe2x80x9crolling backxe2x80x9d the database to a prior state.
In the context of concurrency control, and transaction logging, it is advantageous to break the model of items 1 and 2, above, into more granular levels or transaction states. Thus, the transactions can be broken down into the steps or transaction states of:
1. BEGIN_TRANSACTION which marks the beginning of a transaction.
2. SELECT or INSERT which specify read or write operations on database items as part of a database transaction.
3. END_TRANSACTION which specifies that the SELECT or INSERT operations have ended. These commands mark then end of transaction execution. It is at this point that the DBMS determines if the changes can be xe2x80x9ccommittedxe2x80x9d, that is, permanently applied.
4. COMMIT_TRANSACTION which signals a successful end of the transaction, the transaction will be committed to the database, and will not be undone.
5. ROLLBACK or ABORT which signals that the transaction has ended unsuccessfully, and must be rolled back, that is, the changes to the database must be undone.
5. Transaction Processingxe2x80x94System Log
When concurrency control fails, and a concurrency problem arises, it is necessary to xe2x80x9croll backxe2x80x9d the failed transactions. Transaction rollback is accomplished using the system log. The system log records transaction information as a transaction progresses. Typically the information includes a transaction ID, and before and after data images for every operation performed by a transaction. The progression of a transaction through stages of COMMIT or ROLLBACK are also recorded on the log.
Logging stops at the xe2x80x9cCommit Pointxe2x80x9d, which is when all operations that access the database have been executed successfully, and the effect of all of the transaction operations on the database have been recorded in the log or journal.
6. Transaction Processingxe2x80x94Recovery and Rollback
If a transaction fails to commit, the database must be rolled back to a previous state, before the failed transaction commenced. This is done by using the journal or log entries to reverse some of the changes in the database, and redo other changes in the database.
7. Transaction Processingxe2x80x94Recovery and Rollbackxe2x80x94Savepoints
Savepoints are like bookmarks within a transaction. They may be created by an application between the beginning and end of a transaction. Savepoints allow modifications made to data and schemas since the setting of the savepoint to be undone via a request by the application to xe2x80x9crollback to savepointxe2x80x9d. Savepoints make a set of non-atomic database requests or sub-transactions of an atomic transaction behave atomically, in that they are reversible as a unit. If an application error occurs during execution, or if the application detects an error, the savepoint can be used to undo changes made by the transaction between the time the savepoint was started and the time the savepoint rollback is requested, and return to a prior point in the transaction.
A savepoint is similar to a compound SQL statement, in that it allows grouping several SQL statements into a single executable block. Before the first statement of the block is executed, a savepoint request to start a savepoint block is required. If any of the subsequent statements end in an application error, the DBMS will roll back only that statement. This provides more granularity than a compound SQL statement, in which a single error anywhere in the block causes the entire block to end in an error and rolls back the entire compound SQL statement. At the end of a savepoint block of statements, one can either release the savepoint, or rollback to the savepoint.
Application savepoints provide control over the work performed by a subset of SQL statements in a transaction or unit of work. Within an application one can set a savepoint, and later either release the savepoint or roll back the work performed since the savepoint. It is also possible to use multiple savepoints and nested savepoints within a single transaction. The following code segment demonstrates the use of two savepoints within a single transaction to control the behavior of an application:
INSERT INTO order . . .
INSERT INTO order_item . . . lampxe2x80x94set the first savepoint in the transaction
SAVEPOINT before_radio ON ROLLBACK RETAIN CURSORS
INSERT INTO order_item . . . Radio
INSERT INTO order_item . . . Power Cord
Pseudo-SQL:
IF SQLSTATE=xe2x80x9cNo Power Cordxe2x80x9d
ROLLBACK TO SAVEPOINT before_radio
RELEASE SAVEPOINT before_radioxe2x80x94set the second savepoint in the transaction
SAVEPOINT before_checkout ON ROLLBACK RETAIN CURSORS
INSERT INTO order . . . Approval
Pseudo-SQL:
IF SQLSTATE=xe2x80x9cNo approvalxe2x80x9d
ROLLBACK TO SAVEPOINT before_checkout commit the transaction, which releases the savepoint
COMMIT
In this code segment, the first savepoint enforces a dependency between two data objects where the dependency is not intrinsic to the objects themselves. One would not use referential integrity to describe the above relationship between radios and power cords since one can exist without the other. However, a seller would not want to ship the radio to the customer without a power cord. But, a seller also would not want to cancel the order of the lamp by rolling back the entire transaction because there are no power cords for the radio. Application savepoints provide the granular control that is needed to complete this order.
Savepoints gives a better performance and a cleaner application design than using multiple COMMIT and ROLLBACK statements. When a COMMIT statement is issued, the DBMS must do some extra work to commit the current transaction and start a new transaction. Savepoints allows breaking a transaction into smaller units or steps without the added overhead of multiple COMMIT statements. The following code segment demonstrates the performance penalty incurred by using multiple transactions instead of savepoints:
INSERT INTO order . . .
INSERT INTO order_item . . . lamp commit current transaction, start new transaction
COMMIT INSERT INTO order_item . . . Radio
INSERT INTO order_item . . . Power Cord
Pseudo-SQL:
IF SQLSTATE=xe2x80x9cNo Power Cordxe2x80x9droll back current transaction, start new transaction
ROLLBACK
ELSE commit current transaction, start new transaction
COMMIT
INSERT INTO order . . . Approval
Pseudo-SQL:
IF SQLSTATE=xe2x80x9cNo approvalxe2x80x9droll back current transaction, start new transaction
ROLLBACK
ELSE commit current transaction, start new transaction COMMIT
Another drawback of multiple commit points is that an object might be committed and therefore visible to other applications before it is fully completed. In example 2, the order is available to another user before all the items have been added, and worse, before it has been approved. Using application savepoints avoids this exposure to xe2x80x98dirty dataxe2x80x99 while providing granular control over an operation.
In comparing application savepoints to compound SQL blocks, savepoints offer the following advantages over compound SQL blocks: enhanced control of transactions; less locking contention; and improved integration with application logic.
Compound SQL blocks can either be ATOMIC or NOT ATOMIC. If a statement within an ATOMIC compound SQL block fails, the entire compound SQL block is rolled back. If a statement within a NOT ATOMIC compound SQL block fails, the commit or roll back of the transaction, including the entire compound SQL block, is controlled by the application. In comparison, if a statement within the scope of a savepoint fails, the application can roll back all of the statements in the scope of the savepoint, but commit the work performed by statements outside of the scope of the savepoint. This option is illustrated in the first code segment, if the work of the savepoint is rolled back, the work of the two INSERT statements before the savepoint is committed. Alternately, the application can commit the work performed by all of the statements in the transaction, including the statements within the scope of the savepoint.
When a user issues a compound SQL block, a DBMS, such as IBM DB2 Relational Database Management System, simultaneously acquires the locks needed for the entire compound SQL block of statements. When a user sets an application savepoint, the DBMS acquires locks as each statement in the scope of the savepoint is issued. The locking behavior of savepoints can lead to significantly less locking contention than compound SQL blocks, so unless an application requires the locking performed by compound SQL statements, it may be best to use savepoints.
Compound SQL blocks execute a complete set of statements as a single statement. An application cannot use control structures or functions to add statements to a compound SQL block. In comparison, when you set an application savepoint, the application can issue SQL statements within the scope of the savepoint by calling other application functions or methods, through control structures such as while loops, or with dynamic SQL statements. Application savepoints give a user the freedom to integrate SQL statements with application logic in an intuitive way. For example, in the following code segment, the application sets a savepoint and issues two INSERT statements within the scope of the savepoint. The application uses an IF statement that, when true, calls the function add_batteries( ). add_batteries( ) issues an SQL statement that in this context is included within the scope of the savepoint. Finally, the application either rolls back the work performed within the savepoint (including the SQL statement issued by the add_batteries( ) function), or commits the work performed in the entire transaction:
void add_batteries( )
{
the work performed by the following statementxe2x80x94is controlled by the savepoint set in main( ) INSERT INTO order_item . . . Batteries
} void main(int argc, char[ ] *argv)
{
INSERT INTO order . . .
INSERT INTO order_item . . . lamp set the first savepoint in the transaction
SAVEPOINT before_radio ON ROLLBACK RETAIN CURSORS
INSERT INTO order_item . . . Radio
INSERT INTO order_item . . . Power Cord
if (strcmp(Radio.power_source( ), xe2x80x9cAC/DCxe2x80x9d))
{
add_batteries( );
}
Pseudo-SQL:
IF SQLSTATE=xe2x80x9cNo Power Cordxe2x80x9d
ROLLBACK TO SAVEPOINT before_radio
COMMIT
}
Savepoints thus provide a high degree of granularity, even within atomic processes,
8. Distributed Databases
Distributed databases are becoming increasingly important. This is motivated by such concerns as the distributed nature of modem enterprises, increased reliability and maintainability of multiple servers, data sharing across the global or distributed enterprise, and improved performance when a large database is distributed over multiple sites, with a capability of processing local transactions locally. However, the decentralization of multiple servers supporting multiple, interacting DBMS""s and distributed databases requires data networking, data distribution and replication management functionality, execution strategies for queries and transaction and queries that span multiple servers, consistency strategies across multiple servers, and the ability to recover from local disturbances and communications failures.
xe2x80x9cDistributed Transactionsxe2x80x9d are transactions impacting records and/or files in multiple DBMSs of a Distributed Database DBMS.
9. Distributed Databasesxe2x80x94Data Fragmentation
A further aspect of distributed databases is data fragmentation. Data fragmentation arises from three strategies, storing some tuples in some servers and other tuples in other servers (horizontal fragmentation), storing some attributes in some servers and other attributes in other servers (vertical fragmentation), and storing some attributes of some tuples in one server, other attributes of the same tuples in another server, some attributes other tuples in still another server, and still other attributes of the other tuples in still another server (mixed fragmentation). The fragments are mapped in a fragmentation schema that defines all of the attributes of all of the fragments, and an allocation schema that maps fragments to sites.
10. Distributed Databasesxe2x80x94Concurrency Control and Recovery in Distributed Databases
As would be expected, regardless of the degree of replication between the nodes, multiple databases on single platforms, and distributed databases with multiple servers have complex concurrency problems. The concurrency control system of the DBMS must maintain concurrency between multiple database entries. It must deal with individual site failures as well as communications network failures. And, it must deal with commit procedures that are accessing database fragments and replications stored on multiple sites, where some sites or communications links may fail during the commit process. Various strategies are used to cope with commit failures in multiple and distributed database systems, including primary site techniques, primary copy techniques, and voting techniques.
11. Distributed Databasesxe2x80x94Savepoints and Recovery in Distributed Databases
U.S. Pat. No. 5,630,124 to Coyle et al., for System And Method For Assuring Atomicity Of Distributed Update Requests In A Parallel Database describes a method of managing distributed requests in a parallel multiple database management system. As described by Coyle et al., a savepoint that is normally visible to only one DBMS is extended to all of the parallel DBMS""s participating in the transaction. This allows an application to set a savepoint on one DBMS and rollback to the same savepoint on another DBMS. The savepoint is identified across networked connections with a message from a coordinating server to participating servers. This message creates a local savepoint on the participant. The participant sends a message back to the coordinating server indicating the success or failure of the execution of the underlying transaction, and, in the event of failure on a local server, a rollback of the transaction on the local server and the other servers.
Coyle et al describe how, responsive to the initiation of a transaction, a request counter for a coordination process is initialized. Upon each receipt of a request for distribution by the coordination process, the request counter is incremented. Request instances, including savepoint data, are generated for each request and distributed to selected subordinate processes on different nodes for the partitions of the database. Upon receipt of a request instance on a selected subordinate process, a local savepoint with the request savepoint data and locally generated savepoint data is stored for each selected subordinate process. When an attempt is made to execute the request instance, the success or failure of the execution attempt is returned to the coordination process. This is done for each request instance. On an indication of failure of execution for any request instance, a rollback is performed on each subordinate process on which the local savepoint indicates execution of a request instance for the request.
Because of the added levels of complexity arising from multiple server sites, communications networks, fragmentation, and replication, a need exists to provide a level of control that is more granular then the xe2x80x9ccommit-rollbackxe2x80x9d methodology. A need exists for a savepoint methodology that is functional with fragmented and replicated databases on multiple servers joined together over a telecommunications network.
The problem of implementing savepoints in distributed database systems is obviated by the method, system, and program product described herein, which provides savepoint functionality across fragmented and replicated database elements distributed over multiple servers and joined together by one or more communications networks.
The method, system, and program product of the invention provides for identifying distributed savepoints, messages containing savepoint data (savepoint name, the name of the server that created the savepoint, and the server""s savepoint sequence number) between servers participating in the transaction and a coordinator server. One message goes from a coordinator server to a participant server, identifying savepoint identifiers that have been rolled back since the sever was last contacted, the cursor disposition (open or closed) identified with each rollback; the savepoint identifiers of any new savepoints that are still in effect since the server was last contacted, and the savepoint identifiers of any savepoints that have been released since this particular server was last contacted.
Each participant DBMS sends a message to the coordinator DBMS identifying any savepoint identifiers that have been rolled back since the last message (along with the cursor disposition), the savepoint identifiers of any savepoints that are still in effect that were established since the last message, and the savepoint identifiers of any savepoints that have been released since the last message.
Also, each time an SQL result set is returned to a requestor, the server sends the sequence number associated with the cursor to the requester. This is so that the result set can be positioned correctly within the savepoints in the task or transaction or other unit of work.
Concurrency, coherency, accuracy, and timeliness are attained by the method, system, and program product described herein. One embodiment of the invention is a method of managing transactions on a plurality of DBMSs residing on separate participant servers and a coordinator server, where a single transaction comprises multiple steps across more than one of the plurality of DBMSs as participants.
According to the method of the invention, savepoints are assigned to the participants across a network of DBMSs, with at least one savepoint set within a multiple step transaction, that is, a transaction having multiple sub-transactions. If, as the transaction progresses, a request to rollback to savepoint is received, the method of the invention calls for rolling back operations performed since the savepoint across the DBMS participants. When the transaction is accepted by the application (even in the face of reasons to reject the transaction), the transaction is accepted across the multiple DBMS participants.
Establishing and managing the savepoints is accomplished by assigning savepoint identifiers to savepoints; sending a message from the coordinator server to the multiple database management system participants that are participants in the transaction identifying the savepoint and providing savepoint information; sending messages from the multiple database management system participants in the transaction to the coordinator server with respect to the savepoint while processing the SQL request, and providing savepoint information; and returning a transaction result set to a requester.
Sending a message from the coordinator server to the multiple database management system participants identifying the savepoint and providing savepoint information comprises sending a message from the coordinator server to the multiple database server participants actually participating in the transaction identifying savepoints that have been rolled back since the database server was last notified, savepoints that are still in effect; and savepoints that have been released. Sending messages from the multiple database management system participants to the coordinator server with respect to the savepoint and providing savepoint information comprises sending a message from the multiple database sever participants to the coordinator server identifying savepoint information since the savepoint was established, that is, savepoints that have been rolled back during this request, savepoints that are still in effect, and savepoints that have been released
A sequence number, as used herein, establishes and shows a times of result set generations and savepoint activities, associating result sets and savepoints. Returning an SQL result set to a requester comprises flowing a sequence number associated with a cursor associated with the result set to the requester so that the result set can be associated with and positioned correctly within savepoints.
A further embodiment of the invention is a distributed DBMS comprising a plurality of individual DBMSs, as participants, in a transaction residing on separate servers, and including a coordinator server. The distributed DBMS is configured and controlled to process transactions which transactions comprise multiple steps, that is, sub-transactions, across more then one of the plurality of DBMSs as participants. This is done by a method comprising assigning across the DBMS participants at least one savepoint within a multiple step transaction, and responding to application requests to rollback to savepoint by undoing operations performed by the transaction since the savepoint across the DBMS participants.
The distributed database management system is further configured and controlled to establish and manage the savepoints by the method of assigning savepoint identifiers to savepoints, sending a message from the coordinator server to the multiple database management system participants identifying the savepoint and providing savepoint information, sending messages from the multiple database management system participants to the coordinator server with respect to the savepoint and providing savepoint information; and returning a transaction result set to a requester.
A still further embodiment of the invention is a program product. The program product, which may be installation tapes, disks, diskettes, or on disks on an installation server, or storage media on the individual servers, as disk drives, tape drives, memory arrays, or a combination thereof, is in the form of a storage medium having computer readable program code thereon. This code, either as written on the medium, or as instantiated in RAM, or both, controls the operation of a distributed database management system, The system contains a plurality of individual DBMSs, i.e., participants, residing on separate servers and including a coordinator server. The program code configures and controls the database management system to process transactions, which comprise multiple steps, that is, sub-transactions, across more then one of said plurality of DBMSs as participants. This is done by the method comprising assigning across the DBMS participants at least one savepoint within a transaction having multiple sub-transactions (steps). If a reason for rejection is found the transaction is rolled back across the DBMS participants to a savepoint.
The program code of the program product configures and controls the individual DBMSs to establish and manage the savepoints by the method of assigning savepoint identifiers to savepoints, sending a message from the coordinator server to the multiple database management system participants identifying the savepoint and providing savepoint information, sending messages from the multiple database management system participants to the coordinator server with respect to the savepoint and providing savepoint information, and returning a transaction result set to a requester.
The method of the invention is useful with any DBMS that uses SQL, especially Relational Database Management Systems (RDBMS), although it may be used with, for example the Integrated Database Management System (xe2x80x9cCODASYLxe2x80x9d), various network database paradigms, and the like.