Database management systems (DBMS), and the applications that use them (“database applications”), are constantly evolving. An example of a DBMS is the Oracle™ database server, produced by Oracle Corporation. In the evolution of DBMSs and database applications, there are many scenarios that require converting the data type of a column in a database table, to support new features and enhancements to the database application or the DBMS. One such scenario involves LONG columns, which store large amounts of data in a table column. Some features of a DBMS that have evolved cannot be used for a table that contains LONG columns. To overcome this limitation, columns with LONG data type must be converted to a LOB (“Large Objects”) data type.
A type of approach for converting the data type of a table column is referred to as offline conversion. In offline conversion, a process dedicated to converting a column and its values is given exclusive access to the table. No transactions for other processes are allowed against the table. General access to the table is revoked. One approach to offline conversion is offline “in-situ” conversion, which involves adding a new column (“in situ”) for a new data type, converting values from rows in the original column to the new data type, updating the new column of the rows with these values, and then publishing the new column as the original column.
Processes involved in converting a data type of a column are referred to herein as conversion processes. The period of time during a conversion when access to a table is limited to conversion processes is referred to herein as downtime. The conversion time is approximately proportional to data change size, therefore offline conversion demands significant downtime if the table contains large amounts of data.
Processes running on a DBMS that are not dedicated to converting a data type are referred to herein as non-conversion processes. Non-conversion processes include, for example, processes associated with a user session, which are typically established in response to a database connection issued by an entity (e.g. application, client) identifying and authenticating itself as a particular user. A DBMS associates a name, authenticating information (e.g. password), access privileges (e.g. access to a particular schema or database objects, security level) with a user.
For convenience of expression, a table that contains a column whose data type is being converted and whose column values are being converted to the new data type is referred to herein as a table undergoing conversion. Thus, the phrase “a table under going conversion” is just a convenient way of expressing that a data type of a column in the table is being converted and the values in the column are being converted to the new data type.
Another variation of offline conversion is offline replacement-table conversion. Under this variation of the offline conversion, a “replacement table” is created with columns that are identical to the table undergoing conversion, except that the replacement table has a new column with the new data type. The new column will become in effect the original column with the new data type. The entire data from the original table is copied to the replacement table, and values from the original column are converted to the new data type and inserted into the new column. In addition, indexes and other database objects dependent on the original table are built on the replacement table. The replacement table is then published as the original table. During the period of conversion, access to the tables is limited only to the conversion process.
Another type of approach for converting the data type of a column is online conversion. The online conversion removes most access restrictions on the table undergoing conversion, allowing non-convert processes to access the table while the table is undergoing conversion. Online conversion, however, does involve some downtime.
A form of online conversion is online replacement-table conversion. Similar to offline replacement-table conversion, this form of online conversion uses a replacement table. The replacement table is created with columns that are identical to the “original table” and a new column of the new data type. Indexes and other database objects dependent on the original table are built on the replacement table. The entire data from the original table is copied to the replacement table and values of the old data type from the original column are converted to the new data type and inserted into the new column.
Unlike offline conversion, however, during online replacement-table conversion, non-conversion processes are allowed access to the original table. Changes made by transactions of non-conversion processes are tracked in a log. After all the data from the table have been copied and converted to the new table, the replacement table is synchronized, that is, the transactions recorded in the log are applied to the replacement table, a process that may also involve conversion of original column values. The synchronization process may occur one or more times depending on the frequency of changes made by non-conversion processes.
Under online replacement-table conversion, there are two periods of downtime. During the first downtime, the infrastructure needed to support the conversion is created. This operation includes creation of the log for the original table and a replacement table with a new column for the new data type. The creation of the replacement table does not, in itself, require downtime. If the DBMS allows creation of a log during uptime, then a period of downtime would not be required for creating the log.
The second downtime occurs at the end of the conversion. A purpose of this downtime is to prevent transactions from changing the original table. A final synchronization is performed during this downtime to apply any transactions from the log not yet applied.
Another important purpose of the second downtime is to coordinate publication of the replacement table with the installation of the new version of database applications that have been adapted to use the new data type for the converted column. Publication refers to making changes to a DBMS that are needed to cause the replacement table to be recognized and treated as the original table by the DBMS. Such changes include changing a DBMS's metadata. It is important that the beginning and end of the downtime be synchronized with the installation of the new version of database applications. The old version of the database application expects the converted column to have the old data type while the new version expects the converted column to have the new data type. The downtime provides a period of time in which to install the new version of the database application so that the new version is running when the replacement table is published but not before.
An advantage of online replacement-table conversion is that it requires potentially less down time. Conversion processes do not require exclusive access during conversion. Database applications may modify a table during conversion. Nevertheless, the downtime can be significant if a large amount of data changes from the log needs to be processed during the final synchronization.
A disadvantage of online replacement-table conversion is that it requires more computer resources than offline conversion, particularly offline in-situ conversion, which does not use a replacement table. Storage space is not only needed for the replacement table and associated database objects, but also the log. Furthermore, CPU resources are needed to copy the entire data from the original table to the replacement table, to recreate the associated database objects, and to record transactions to the log and apply them to the replacement table. If the original table is very large, the amount of resources consumed to convert the data type of a column can be very significant.
The burden placed on resource usage is further aggravated by the fact that column data types often must be converted for batches of tables at a time. Often, database systems and base applications evolve in cycles that involve column data type conversion for multiple tables. A new version of an application may contain changes needed for all the column data type changes in a cycle. As a result, all the tables must undergo conversion simultaneously. For the forms of conversion that use a replacement table, converting many tables simultaneously requires storing a copy of many tables.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.