The present invention relates to providing atomicity for statements executed by a database server.
In typical database systems, users store, update, and retrieve information by interacting with user applications ("clients"). The clients respond to the user's interaction by submitting commands to a database management system (a "DBMS") responsible for maintaining the database. The database server responds to the commands by performing the specified actions on the database. To be correctly processed, the commands must comply with the database language that is supported by the database server. One popular database language is known as Structured Query Language (SQL).
Various access methods may be used to retrieve data from a database. The access methods used to retrieve data may significantly affect the speed of the retrieval and the amount of resources consumed during the retrieval process. Many access methods use indices to increase the speed of the data retrieval process. A typical DBMS has built-in support for a few standard types of access methods, such as access methods that use B+Trees and Hash Tables, that may be used when the key values belong to standard sets of data types, such as numbers, strings, etc. The access methods that are built into a database system are referred to herein as native access methods.
In recent years, databases have been used to store different types of data, such as text, spatial, image, video, and audio data. For many of these complex data types, the standard indexing techniques and access methods cannot readily be applied. To provide efficient data retrieval, many database systems that allow users to store complex data types attempt to provide access methods suitable for the complex data types. However, attempts to provide native support for all types of access methods are unrealistic, because it is not possible to foresee all the possible types of complex data that clients may wish to store in a database, much less all types of access methods that one may wish to use with such data types. To allow database management systems to support non-native index schemes, a mechanism referred to as "extensible indexing" has been developed. Extensible indexing is described, at least in part, in "Extensible Indexing", U.S. application Ser. No. 08/677,159, filed on Jul. 7, 1996, now U.S. Pat. No. 5,893,104, by Jagannathan Srinivasan, Ravi Murthy, Chin Hong, Samuel Defazio, and Anil Nori, and assigned to the assignee of the present application.
Extensible indexing provides a general framework for integrating non-native indexes. The user supplies metadata defining an index type. In particular, the metadata defines access methods for the index type, and methods for creating and maintaining data for the index type. Indexes that are managed by methods supplied in this manner are instances of a defined index type, and are referred to herein as domain indexes. The process of creating an index refers to generating metadata defining the index, and generating data structures that hold the index data (e.g. tables). The process of creating an index may also include creating index data reflecting data that already exists in the table being indexed (the "base table"), when the base table contains data at the time the index is created.
Domain indexes, like indexes in general, are created in response to data definition language (DDL) statements issued to a DBMS that specify the creation of indexes. A statement is a set of one or more commands executed as an atomic unit. A DDL statement is a statement that specifies a definition or change to a definition of a database schema object. A database schema object is a data structure managed by a database management system that may be referenced in a database language; examples include tables, indexes, views, materialized views, data types, and procedures. The operations performed by a DBMS to execute a DDL statement are referred to as DDL operations. DDL operations include creating, changing, or deleting database metadata and the data structures used to hold data for the database schema object. For example, in response to receiving a DDL statement specifying the creation of an index, database metadata defining the index is created, one or more database schema objects are created, and index data is generated and stored in these objects.
To process a DDL statement, conventional DBMSs may follow a one statement, one transaction model. The changes specified by a DDL statement are processed by the database server as a single transaction. When a DBMS receives a DDL statement, the DBMS commits before and after executing the DDL statement. A transaction is an atomic unit of work, which may include one or more statements. The term commit refers to making permanent the changes to data specified by a transaction.
Treating the process of creating a domain index as one transaction causes several problems. Some of these problems stem from the DBMS performing callouts to user-supplied routines to create domain indexes. A callout is a call to an external routine supplied by a user. A callback is a call from a callout to the database server.
The following example illustrates the problems that arise from callouts used to generate indexes. A user issues a statement specifying the creation of a domain index. In response, the DBMS creates metadata describing the domain index, and issues a callout to a user-supplied routine for creating the index. The user-supplied routine issues a callback, which requests the execution of a DDL statement for creating a table for the index.
In many database systems that follow the one statement, one transaction model for DDL statements, the issuing of a DDL statement causes a commit before and after executing the statement. Thus, after the DDL statement is executed, the work performed before creating the table, and the creation of the table definition itself, are committed. The callout then performs additional work, encounters an error, and aborts the current transaction. As a result, work performed after creating the table definition is rolled back, while the work performed beforehand remains committed.
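The partial-commit scenario described above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyDBMS`, `create_index_callout`, and `create_domain_index` are hypothetical names, and the in-memory lists merely stand in for committed and uncommitted changes.

```python
class ToyDBMS:
    """Toy model (hypothetical): a DDL statement commits before and after it runs."""

    def __init__(self):
        self.committed = []  # changes made permanent
        self.pending = []    # changes of the current, uncommitted transaction

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []

    def do_work(self, change):
        self.pending.append(change)

    def execute_ddl(self, change):
        self.commit()                # implicit commit before the DDL statement
        self.pending.append(change)
        self.commit()                # implicit commit after the DDL statement


def create_index_callout(db):
    """Hypothetical user-supplied routine that builds the index."""
    db.execute_ddl("create index table")  # callback into the DBMS
    db.do_work("populate index table")
    raise RuntimeError("callout failed")  # error after the table was created


def create_domain_index(db):
    """Outer DDL statement: create metadata, then call out to the user routine."""
    db.do_work("create index metadata")
    try:
        create_index_callout(db)
    except RuntimeError:
        db.rollback()  # only the work after the nested DDL is undone
        return False
    db.commit()
    return True


db = ToyDBMS()
ok = create_domain_index(db)
print(ok, db.committed)
```

In this sketch the statement fails, yet the index metadata and the index table definition stay committed, which is the inconsistency the text describes.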
There are several undesirable consequences to leaving part of the work committed in this manner. The database is not left in the state that existed before the user issued the statement to create the domain index, thus violating the one statement, one transaction model followed by conventional DBMSs for DDL statements. In addition, statement atomicity is violated. Statement atomicity requires that all the work specified by a statement be performed as an atomic unit. If the DBMS cannot execute the statement completely, any changes caused by executing the statement are rolled back.
Executing DDL statements for domain indexes as one transaction may not be desirable anyway. To create a domain index for a large file, a DBMS may generate a large amount of data, expending a large amount of work. When the work of creating an index is performed as part of one transaction, and the transaction is aborted, the transaction is rolled back, and all the work is wasted.
Callbacks not only cause problems for creating indexes; they may also cause problems when modifying data in a domain index. For example, when a DBMS receives a statement to modify data in the database (a "DML" statement), and the data is indexed by a domain index, the DBMS invokes a user-supplied routine for modifying the domain index. The user-supplied routine may execute a commit, causing the current transaction to commit. If the DBMS is then unable to complete execution of the DML statement, part of the work performed is committed and cannot be rolled back. This situation, as explained earlier, violates statement atomicity.
Based on the foregoing, it is clearly desirable to provide a level of atomicity for executing DDL statements when creating indexes, while providing a means of preserving work performed when the process of creating a domain index is aborted. It is also desirable to provide a method for ensuring statement atomicity when the execution of DML and DDL statements involves callouts.
Described herein is a framework for providing statement atomicity for DDL statements. According to an aspect of the present invention, the framework makes it possible to perform, as multiple transactions, the DDL operations specified by a DDL statement. To begin execution of a DDL statement, a DBMS, for example, updates a flag to indicate that DDL operations have commenced. While the flag is set to this state, the DBMS prevents execution of operations that depend on the DDL statement being executed as an atomic unit. If the DDL operations are aborted, the flag is set to a state that indicates that the execution of the DDL operations did not complete, and the DBMS continues to disallow operations that depend on the atomicity of the DDL statement. Because the flag is used to provide statement atomicity, DDL operations may be performed as multiple transactions.
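The flag mechanism summarized above might be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `IndexState`, `Catalog`, and `IndexUnavailable` are hypothetical, and the `VALID` state marking successful completion is an assumption, since the text above only names the "commenced" and "did not complete" states.

```python
from enum import Enum


class IndexState(Enum):
    IN_PROGRESS = "in progress"  # DDL operations have commenced
    FAILED = "failed"            # DDL operations did not complete
    VALID = "valid"              # assumed state: DDL operations completed


class IndexUnavailable(Exception):
    """Raised when an operation depends on a DDL statement that is not complete."""


class Catalog:
    """Hypothetical catalog that tracks one status flag per index."""

    def __init__(self):
        self.state = {}      # index name -> IndexState
        self.committed = {}  # index name -> results of committed steps

    def create_index(self, name, steps):
        self.state[name] = IndexState.IN_PROGRESS
        self.committed[name] = []
        for step in steps:
            try:
                result = step()
            except Exception:
                # Earlier steps stay committed; the flag alone hides them.
                self.state[name] = IndexState.FAILED
                return False
            # Each step is modeled as its own committed transaction.
            self.committed[name].append(result)
        self.state[name] = IndexState.VALID
        return True

    def use_index(self, name):
        # Dependent operations are refused unless the DDL statement completed.
        if self.state.get(name) is not IndexState.VALID:
            raise IndexUnavailable(name)
        return self.committed[name]
```

Under these assumptions, work committed by earlier steps survives a failure, while dependent operations still see the index as unusable, so the statement appears atomic to them.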
According to another aspect of the present invention, a mechanism preserves the transactional context of a DML statement being executed. When, for example, a DBMS is executing a transaction and generates a callout, the DBMS prevents operations that may change the transactional context of work performed in response to the callout.
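One way to picture this aspect is a session that refuses transaction control operations while a callout is active. The sketch below is illustrative only: `Session`, `TransactionControlError`, and the list-based statement savepoint are hypothetical and are not taken from the description above.

```python
class TransactionControlError(Exception):
    """Raised when a callout attempts to change the transactional context."""


class Session:
    """Hypothetical session that guards commit and rollback during callouts."""

    def __init__(self):
        self.in_callout = False
        self.pending = []    # uncommitted changes of the current transaction
        self.committed = []  # changes made permanent

    def commit(self):
        if self.in_callout:
            raise TransactionControlError("commit not allowed during a callout")
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        if self.in_callout:
            raise TransactionControlError("rollback not allowed during a callout")
        self.pending = []

    def write(self, change):
        self.pending.append(change)

    def call_out(self, routine):
        # Mark the callout as active so commit and rollback are refused.
        self.in_callout = True
        try:
            routine(self)
        finally:
            self.in_callout = False

    def execute_dml(self, change, index_routine):
        mark = len(self.pending)  # statement-level savepoint
        self.write(change)
        try:
            self.call_out(index_routine)
        except Exception:
            del self.pending[mark:]  # undo only this statement's work
            return False
        return True
```

Because the user routine cannot commit, a failed DML statement can always be undone in full, and the enclosing transaction stays open for the caller to commit or roll back.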