1. Field of the Invention
The present invention relates to a method, system, and program for managing file names during the reorganization of the database files that include data for the database objects subject to the reorganization process.
2. Description of the Related Art
To reduce the amount of time to physically access database records from storage, it is desirable to physically store related database records near one another. Database records may be logically related according to a key or index value. This process of storing records ordered according to a key value close to one another is referred to as clustering. Clustering describes the arrangement of the rows of a database in physical storage according to the ordering of index keys. In this way, with clustering, data records are arranged in storage such that the logical ordering of records according to index keys corresponds to the physical ordering of the data records on the storage device. If related records are stored in physical proximity on a hard disk drive surface, then the database program will access records faster, thereby improving system performance. Access time is reduced by storing related records in close physical proximity because the seek and rotational times to move from one record to the next on the storage device, such as a hard disk drive, are minimized.
Nevertheless, as records are inserted into and deleted from a database table, the degree of clustering degrades. The causes of this degradation are described in an International Business Machines Corporation (IBM) publication entitled “A Method for On-Line Reorganization of a Database,” by G. H. Sockut, T. A. Beaving, and C.-C. Chang, having IBM document no. G321-5651 (March 1997), and in the commonly assigned patent entitled “Interaction Between Application of a Log and Maintenance of a Table that Maps Record Identifiers During Online Reorganization of a Database,” U.S. Pat. No. 5,721,915, which publication and patent are incorporated herein by reference in their entirety.
Writing and updating of records in a database can reduce the degree of clustering, which increases the time to access related database records because such records are no longer maintained within close physical proximity. Database programs include a reorganization process to restore clustering and thereby improve access performance. Reorganization also improves space utilization by removing dropped tables and rows, eliminating pointers to overflow records, etc. On-line reorganization methods seek to minimize the time during which the database is unavailable for users to access. Allowing users to access databases during reorganization is essential for large or highly available databases where continuous availability is crucial. Examples of highly available databases include those for reservation systems, finance (especially global finance), process control, hospitals, police and armed forces. Even less essential applications prefer high availability. Further, reorganization of very large databases can take considerable amounts of time. Thus, taking such less essential, large databases off-line for a substantial amount of time is also undesirable.
Current on-line reorganization methods first unload, or copy out, the data from the old (original) copy to a shadow copy. The shadow copy of unloaded data is then sorted by a clustering key to optimize the clustering of the data in the shadow copy according to an index key. While the shadow copy is being sorted, applications can read and write data to the old copy. Any updates, i.e., writes, made to the old copy while the shadow copy is being sorted and clustered are entered in a log. The reorganization applies to any database objects related to the database object being reorganized. For instance, if a table space and index space are being reorganized, then all tables within the table space and the index in the index space are also reorganized. In a subsequent phase, the reorganization routine applies the logged updates to the shadow copy so that the shadow copy reflects recent update activity on the old copy.
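The unload, sort, and log-apply phases described above can be illustrated with a minimal Python sketch. The record layout, the function names `unload_and_sort` and `apply_log`, and the log entry format are assumptions made only for illustration and are not taken from any particular database program; the sketch further assumes that each clustering key identifies at most one record.

```python
from typing import List, Tuple

Record = Tuple[int, str]          # (clustering key, payload), hypothetical layout
LogEntry = Tuple[str, int, str]   # (operation, clustering key, payload), hypothetical layout


def unload_and_sort(old_copy: List[Record]) -> List[Record]:
    """Copy the records out of the old copy and order them by clustering key."""
    return sorted(old_copy, key=lambda record: record[0])


def apply_log(shadow_copy: List[Record], log: List[LogEntry]) -> List[Record]:
    """Replay updates logged against the old copy onto the shadow copy."""
    by_key = dict(shadow_copy)
    for operation, key, payload in log:
        if operation == "delete":
            by_key.pop(key, None)
        else:
            # "insert" and "update" both store the latest payload for the key.
            by_key[key] = payload
    # Re-establish key order so the shadow copy remains clustered.
    return sorted(by_key.items())


old_copy = [(3, "c"), (1, "a"), (2, "b")]
shadow_copy = unload_and_sort(old_copy)
log = [("update", 2, "B"), ("insert", 0, "z"), ("delete", 3, "")]
shadow_copy = apply_log(shadow_copy, log)
```

In this sketch the old copy is never modified, which mirrors the fact that applications continue to use the old copy until the switch takes place.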
At the end of the phase of applying the log updates to the shadow copy, the SWITCH phase begins, during which the updated shadow copy is renamed to the name of the old copy. During the SWITCH phase, any access requests to the database objects involved in the reorganization are queued until the SWITCH phase is complete. Queued access requests are delayed and may time out if the SWITCH phase exceeds a pre-defined time-out period. The renaming process involves renaming the old copy to a temporary name and renaming the shadow copy to the name of the old copy. The old copy may then be deleted. With this method, there are two renaming operations for each database object to be renamed. This SWITCH process can take several minutes, during which the database is off-line.
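The two renaming operations of the prior art SWITCH phase can be sketched as follows. The sketch uses a Python dictionary as a stand-in for a file system, and the function name `switch_by_rename` and the file names are hypothetical; an actual system would issue rename requests to the operating system or storage manager.

```python
from typing import Dict


def switch_by_rename(files: Dict[str, str], source_name: str,
                     shadow_name: str, temp_name: str) -> int:
    """Perform the rename-based switch and return the number of renames used."""
    renames = 0
    # First rename: move the old copy out of the way under a temporary name.
    files[temp_name] = files.pop(source_name)
    renames += 1
    # Second rename: give the shadow copy the name of the old copy.
    files[source_name] = files.pop(shadow_name)
    renames += 1
    # The old copy, now under the temporary name, may then be deleted.
    del files[temp_name]
    return renames


files = {"TABLE1": "old data", "TABLE1.SHADOW": "reorganized data"}
rename_count = switch_by_rename(files, "TABLE1", "TABLE1.SHADOW", "TABLE1.TEMP")
```

Because both renames occur while access requests are queued, the cost of the switch in this approach grows with the number of files being reorganized.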
There is thus a need in the art to improve the reorganization process to further minimize the time during which a database is off-line, which is especially problematic for highly available databases.
To overcome the limitations in the prior art described above, preferred embodiments disclose a system, method, and program for reorganizing at least one database object. The database object is comprised of at least one database file. Each database file has a name. Source database files including data for the database objects subject to the reorganization have source names. Shadow copies of the source database files are created and shadow names for the shadow copies are generated, such that the source names and corresponding shadow names are different. The data in the shadow copies is reorganized. After the reorganization, the shadow names are used to access the database files for the reorganized database objects.
In further embodiments, a database file name is comprised of multiple elements including a qualifier element.
Additional embodiments include system information indicating the name of the database files that include data for database objects. The system information is processed to determine the source database files that include data for the database objects subject to the reorganization. The system information is modified to indicate that the database files for the reorganized database objects are the shadow names. The shadow names are different from the source names of the source database files from which the shadow copies were created.
In yet further embodiments, the system information includes a value of the qualifier element, wherein the value of the qualifier element is one of a first value and a second value. During the process of generating the shadow name, the system information is processed to determine whether the qualifier element of the source database file is the first value or the second value. The shadow name is initially set to the source name. The qualifier element in the shadow name is then set to the first value after determining that the qualifier element indicated in the system information is the second value, or set to the second value after determining that the qualifier element indicated in the system information is the first value.
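The generation of a shadow name by alternating the qualifier element can be sketched as shown below. The dot-separated name format, the position of the qualifier element, and the concrete values "I" and "J" for the first and second values are assumptions chosen for illustration; the function name `generate_shadow_name` is likewise hypothetical.

```python
FIRST_VALUE = "I"   # hypothetical first value of the qualifier element
SECOND_VALUE = "J"  # hypothetical second value of the qualifier element


def generate_shadow_name(source_name: str, catalog_qualifier: str,
                         qualifier_index: int) -> str:
    """Build a shadow name that differs from the source name only in the qualifier."""
    if catalog_qualifier not in (FIRST_VALUE, SECOND_VALUE):
        raise ValueError("unexpected qualifier value: " + catalog_qualifier)
    # Start from the source name, split into its elements.
    elements = source_name.split(".")
    # Set the qualifier to whichever value the system information does not indicate.
    if catalog_qualifier == FIRST_VALUE:
        elements[qualifier_index] = SECOND_VALUE
    else:
        elements[qualifier_index] = FIRST_VALUE
    return ".".join(elements)


shadow = generate_shadow_name("DB1.TS1.I.A001", "I", 2)
next_shadow = generate_shadow_name(shadow, "J", 2)
```

Because only two qualifier values are used, a second reorganization of the same file in this sketch produces the original name again, so the names alternate from one reorganization to the next.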
Preferred embodiments provide a method for indicating to a database program the names of the reorganized database files. With preferred embodiments, the source database files that are reorganized do not need to be renamed, nor do the reorganized database files need to be renamed to the source names of the database files that were subject to the reorganization. Instead, system information is updated to identify the database files having the shadow names as the database files for the reorganized database objects. Thus, the shadow copies are made the source database files by updating system information. The database program processes this updated system information to determine the database file to access when accessing the reorganized database objects. In this way, preferred embodiments further minimize the time the database objects are unavailable for access by avoiding the step of renaming the reorganized copies of the database files to the source names of the database files.
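The switch performed by updating system information, rather than by renaming files, can be sketched as follows. The dictionary standing in for the system information, the function names `switch_by_system_information` and `resolve_file_name`, and the object and file names are hypothetical and are used only to illustrate the idea.

```python
from typing import Dict


def switch_by_system_information(system_info: Dict[str, str],
                                 database_object: str,
                                 shadow_name: str) -> str:
    """Point the database object at its shadow copy and return the former source name."""
    former_source_name = system_info[database_object]
    # No file is renamed; only the recorded name changes.
    system_info[database_object] = shadow_name
    return former_source_name


def resolve_file_name(system_info: Dict[str, str], database_object: str) -> str:
    """Return the file the database program should access for the object."""
    return system_info[database_object]


system_info = {"TS1": "DB1.TS1.I.A001"}
former = switch_by_system_information(system_info, "TS1", "DB1.TS1.J.A001")
current = resolve_file_name(system_info, "TS1")
```

In this sketch the switch consists of a single update to the system information per file, and the file under the former source name remains in place until it is separately deleted.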