1. Field of the Invention
The present invention relates to processing requests on data contained in a database. More particularly, the invention concerns a method and apparatus for reorganizing a database while allowing substantially uninterrupted access to the database.
2. Description of the Related Art
Databases are used on computers for a myriad of reasons. In many cases the databases are extremely large, having entries in the millions. With large databases, the information must be available at all times on a transactional or real-time basis, and large mainframe computers are usually employed to access the data. International Business Machines Corporation (IBM), assignee of the present invention, has developed the leading database environment, referred to as DB2, for use in conjunction with compatible mainframe computers.
One feature common to all database systems, and included in DB2, is the capability to index various information. The use of an index allows faster access for searches and requests based upon the indexed information. DB2 uses a balanced tree index structure. In this structure, root, tree and leaf pages are used, with each page at each level containing the same number of entries, except the last one. The leaf pages are the lowest level, and each contains a number of entries referring to the actual data records contained in DB2 data tables. Each leaf page is maintained in internal logical order automatically by DB2. Tree pages are the next level up, and are used to indicate the logical order of the leaf pages.
For large databases, there may be several layers of tree pages, for example, a higher level of tree pages referencing a lower level of tree pages. Ultimately, the number of tree pages is reduced such that all the entries or references fit into a single page referred to as the root page. As in leaf pages, within each tree or root page the entries are kept in logical order automatically by DB2.
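By way of illustration only, the root and leaf page arrangement described above can be sketched as follows. The page contents, the names LEAF_PAGES and ROOT_PAGE, and the lookup function are hypothetical and greatly simplified; they do not represent the actual DB2 page format, and the sketch omits intermediate tree pages.

```python
from bisect import bisect_left

# Hypothetical model: each leaf page is a sorted list of (key, record_id)
# entries, and the single root page holds the highest key of each leaf page.
LEAF_PAGES = [
    [(10, "r1"), (20, "r2"), (30, "r3")],
    [(40, "r4"), (50, "r5"), (60, "r6")],
    [(70, "r7"), (80, "r8")],  # the last page may hold fewer entries
]
ROOT_PAGE = [page[-1][0] for page in LEAF_PAGES]


def lookup(key):
    """Return the record id indexed under key, or None if key is absent."""
    # Root level: find the first leaf page whose highest key is >= key.
    leaf_no = bisect_left(ROOT_PAGE, key)
    if leaf_no == len(LEAF_PAGES):
        return None
    # Leaf level: entries are kept in logical order, so binary search works.
    leaf = LEAF_PAGES[leaf_no]
    keys = [k for k, _ in leaf]
    pos = bisect_left(keys, key)
    if pos < len(leaf) and leaf[pos][0] == key:
        return leaf[pos][1]
    return None
```

Because every level is kept in logical order, a search touches one page per level rather than scanning all entries.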
One problem with this type of index organization is that the physical location of the leaf pages often becomes quite scattered. Another problem is that the rows of an index (an index being ordered row by row) may become scattered across multiple data pages, rather than clustered together. This scattering results in reduced performance, because the storage device must move between widely scattered physical locations if logical order operations are to be performed. This is true of whatever type of direct access storage device (DASD) is used to store the index or data file. Therefore, the files, including the index file, need to be reorganized periodically so that the logical and physical ordering of the leaf pages and data pages better correspond. However, current methods used to reorganize the files require access to the files to be restricted for most of the reorganization process. In a database that requires 24×7 availability, that is, twenty-four hours a day, seven days a week accessibility, long durations of data unavailability are unacceptable.
One example of a database requiring 24×7 availability is a financial database for storing a bank's records. Regular record reorganization is required to minimize storage overhead for the ever-changing records. However, a bank cannot afford to “close down” record access to reorganize its database. Customer service requires access during the day, and processing other transactions, commonly occurring at night, requires nighttime access.
Recognizing that a reorganization utility can be one of the largest inhibitors to data access (reorganization utilities commonly block access to the data by other utilities and applications during the reorganization process), several solutions have been proposed to make data available more of the time. These “online” reorganization methods help minimize data “outages.”
A major drawback to these techniques is that they require a reorganizational process to request a “blocking drain”, also known as a lock, on a resource, thereby making other processes wait. For example, if reorganizational process B requests a lock, it must wait until a process A, which already has a lock, finishes. If another process C comes along before process A finishes, process C queues up behind process B and must wait for both A and B to finish. Once A finishes, B locks the database and process C continues to wait until reorganizational process B can record a starting point. The wait experienced by process C can be substantial if process A is long running or does not complete, or if process C must also wait for additional processes preceding A to finish.
Accordingly, there is a need for an online database reorganization technique using a “non-blocking” drain which will wait for a resource without blocking other requests on that resource. Referring to the above example, there is a need for a technique where process C can access the database while process B waits for process A to finish so that a reorganization starting point, or logical record sequence number (LRSN), can be established by B, thereby allowing the reorganization process to begin.
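The cost of such scattering can be illustrated with a simple sketch. The function name total_seek_distance and the slot numbers below are hypothetical; the sketch assumes, purely for illustration, that the cost of moving between two pages is proportional to the distance between their physical slots.

```python
def total_seek_distance(physical_positions):
    """Sum of the jumps made when pages are visited in logical order.

    physical_positions[i] is the (hypothetical) physical slot of the page
    that comes i-th in logical order.
    """
    return sum(
        abs(b - a)
        for a, b in zip(physical_positions, physical_positions[1:])
    )


# Five leaf pages whose logical order no longer matches their physical order.
scattered = [7, 2, 9, 0, 5]
# After reorganization, logical and physical order correspond.
reorganized = sorted(scattered)

print(total_seek_distance(scattered))    # 26
print(total_seek_distance(reorganized))  # 9
```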
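The difference between the two kinds of drain in the A, B, C example can be sketched with a toy model. The function access_while_holder_runs and its arguments are hypothetical and are not part of DB2; the sketch only records which processes have access to the resource while the first process is still running.

```python
def access_while_holder_runs(arrivals, blocking):
    """Return the names of processes that have access to the resource
    while the first arrival (the current holder) is still running.

    arrivals: list of (name, kind) tuples in arrival order, where kind is
              "app" for an ordinary process or "reorg" for a reorganization
              process requesting a drain.
    blocking: True models a blocking drain, False a non-blocking drain.
    """
    running = []
    drain_pending = False
    for name, kind in arrivals:
        if kind == "reorg":
            # The drain must wait for earlier processes in either model.
            drain_pending = True
            continue
        if drain_pending and blocking:
            # Blocking drain: later processes queue behind the drain.
            continue
        # No drain pending, or non-blocking drain: access is granted.
        running.append(name)
    return running


arrivals = [("A", "app"), ("B", "reorg"), ("C", "app")]
print(access_while_holder_runs(arrivals, blocking=True))   # ['A']
print(access_while_holder_runs(arrivals, blocking=False))  # ['A', 'C']
```

In both cases process B waits for process A; only under the blocking drain does process C wait as well.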
There is also a need for an online database reorganization technique using a non-blocking drain which allows access to a database during the reorganization of the data and minimizes unavailability of the database in completing the reorganization process.
Broadly, the present invention concerns an online database reorganization process using a non-blocking drain. The process does not block other processes' requests for access to the target database during the data reorganization, even if the reorganization process is waiting for the target database to become available. The reorganization technique causes the database to be unavailable only briefly, when completing the reorganization process.
In one embodiment, the invention may be implemented to provide a method to reorganize a database that does not prevent other processes from accessing the database while the reorganization is in progress. The method uses a non-blocking drain to lock on an original database or “resource”, unloads data to be reorganized from the resource, assigns an LRSN, reorganizes the copied data, and loads it into a shadow location. Log records may be used to adjust the data in the shadow location to account for changes to resource data that occurred after the data was unloaded. Lastly, the data in the resource is replaced with the reorganized data.
In another embodiment, the invention may be implemented to provide an apparatus comprising a database and a digital processing device. The digital processing device may be configured in one embodiment to receive data from the database and then to reorganize and restore the database as disclosed immediately above.
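The sequence of steps in this embodiment can be sketched as follows, under several simplifying assumptions that are not taken from the disclosure: the resource is modeled as a Python dictionary whose contents reflect the moment of unloading, "reorganizing" is modeled as sorting by key, each log record is a hypothetical (lrsn, key, value) tuple in which a value of None denotes a deletion, and the non-blocking drain itself is omitted.

```python
def reorganize_online(resource, log, start_lrsn):
    """Reorganize resource in place using a shadow copy and log records.

    resource:   dict of key -> value as it stood when the data was unloaded.
    log:        list of (lrsn, key, value) change records; value None means
                the key was deleted (hypothetical record format).
    start_lrsn: the LRSN assigned when the data was unloaded.
    """
    # Unload a copy of the data to be reorganized.
    unloaded = dict(resource)

    # Reorganize the copy and load it into the shadow location.
    shadow = dict(sorted(unloaded.items()))

    # Apply log records written after the starting point, so the shadow
    # reflects changes made while the reorganization was running.
    for lrsn, key, value in log:
        if lrsn <= start_lrsn:
            continue  # already reflected in the unloaded data
        if value is None:
            shadow.pop(key, None)
        else:
            shadow[key] = value
    shadow = dict(sorted(shadow.items()))  # restore order after inserts

    # Replace the resource data with the reorganized data. This is the
    # only step during which the resource would be unavailable.
    resource.clear()
    resource.update(shadow)
    return resource


table = {30: "c", 10: "a", 20: "b"}
changes = [(5, 10, "a"), (12, 25, "x"), (13, 30, None)]
reorganize_online(table, changes, start_lrsn=10)
print(list(table.items()))  # [(10, 'a'), (20, 'b'), (25, 'x')]
```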
In still another embodiment, the invention may be implemented to provide a signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital data processing apparatus to perform a method for reorganizing a database.
The present invention affords its users a number of distinct advantages. One advantage is that the invention provides substantially continuous access to the database while the reorganization process is executing or waiting to execute. Another advantage is that the invention provides a non-blocking drain, which differs from a blocking drain. The non-blocking drain allows the reorganization process to lock and queue while earlier processes (processes which requested database access before the reorganization process) complete their routines. At the same time, database access by later processes, that is, processes requesting database access after the reorganization process, is not impeded by the non-blocking drain.
Another advantage of the present invention is that the database is inaccessible to processes other than the reorganization process only during the brief period when the data in the original database is replaced with the reorganized data. The invention also provides a number of other advantages and benefits, which should be apparent from the following description of the invention.