1. Field of the Invention
The present invention relates to computer software, and more particularly to reorganizing databases.
2. Description of the Related Art
The IMS database (IMS DB) was created in 1970 by International Business Machines Corporation (IBM) and is one of the two major parts of IBM's IMS/ESA (Information Management System/Enterprise Systems Architecture). The second part is a data communications system (IMS Transaction Manager or IMS TM). Together, the transaction manager and the database manager create a complete online transaction processing environment providing continuous availability and data integrity. IMS/ESA runs under the MVS/ESA or OS/390 operating systems, which run on the S/390 platform.
At the heart of IMS DB are its databases and its data manipulation language, Data Language/I (DL/I). The IMS database is a hierarchical (non-relational) database. IMS databases are hierarchic collections of data: information is organized in a pyramid fashion, with data at each level of the hierarchy related to, and in some way dependent upon, data at the higher level of the hierarchy. DL/I calls allow a user to create and access these IMS databases.
An IMS database may include one or more data set groups. Each data set group may include one or more segments. A segment is the smallest piece of data DL/I can store. Each segment may be qualified by its hierarchical relationship to other segments in a database record. Each database record has one root segment and zero or more child segments. A “root segment” is at the top of the hierarchy, and there may be only one root segment in a database record. All other segments (other than the one root segment) in a database record are referred to as “dependent segments”, and their existence depends on there being a root segment. A “parent segment” is any segment that is defined in the database descriptor (DBD) as capable of having a dependent segment beneath it in the hierarchy. A “child segment” is any segment that is a dependent of another segment above it in the hierarchy.
Segments may be of various segment types. Those segments which share similar qualities are of the same type. For example, if the root segment of a database record represents a course, and that root segment has three child segments labeled: instructor, student, and location, those child segments may be referred to as segment types.
The root segment is referred to as a first level of the IMS database, and direct children of the root segment are referred to as a second level of the IMS database. As used herein, a second level of the IMS database may alternatively be referred to as a first level child segment, as child segments may only appear starting with the second level of the IMS database. Similarly, children of the children of the root segment (i.e., grandchildren of the root segment) are referred to as a third level of the IMS database, or alternatively, second level child segments. The level of each subsequent generation of children may be determined by incrementing the previous level by one (e.g., a fourth level of the IMS database is equivalent to a third level child segment).
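The level numbering described above can be illustrated with a minimal sketch. The `Segment` class, its method names, and the GRADE segment type are hypothetical and chosen only for illustration; they are not IMS or DL/I structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical in-memory model of one segment in a database record."""
    segment_type: str
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    children: List["Segment"] = field(default_factory=list)

    def add_child(self, segment_type: str) -> "Segment":
        child = Segment(segment_type, parent=self)
        self.children.append(child)
        return child

    def database_level(self) -> int:
        # The root segment is level 1; each generation below it adds one.
        level, node = 1, self
        while node.parent is not None:
            level += 1
            node = node.parent
        return level

    def child_level(self) -> int:
        # Child levels start at the second database level (0 means root).
        return self.database_level() - 1


# The course example from the text, plus a hypothetical grandchild type.
course = Segment("COURSE")
instructor = course.add_child("INSTRUCTOR")
student = course.add_child("STUDENT")
location = course.add_child("LOCATION")
grade = student.add_child("GRADE")  # hypothetical third-level segment
```

In this sketch, `grade.database_level()` is 3 and `grade.child_level()` is 2, matching the statement that a third level of the database corresponds to a second level child segment.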
An IMS database may include up to ten data set groups into which segments of the IMS database may be written. Each segment type may only be assigned to one data set group. When IMS databases are created, definitions of which data set group each segment type is to be written to are specified. In some cases, an IMS database may also be divided into partitions, in addition to being distributed across data set groups. A database record is made up of a root segment and child segments. As an IMS database is used, segments and database records are added, modified and deleted. Over time, the child segments of a database record may become scattered across different blocks within a data set group, resulting in slower access times and longer latencies than would occur if the child segments were closer together. Reorganizing the location of the various segments of an IMS database such that segments of database records are closer together results in faster access times and shorter latencies.
The need to reorganize an IMS database stems from the dynamic nature of insertions and deletions of segments in an IMS database. In general, as new child segments are added to an IMS database hierarchy, the segments may be added to blocks depending on space availability. As a result, related segments (i.e., segments belonging to the same database record) may be stored in different blocks, possibly non-contiguous blocks. This results in a fragmented database, as shown in FIG. 3. As a result, access of a database record may require reading a number of non-contiguous blocks, which results in lengthier access times. One method of reducing access times is to reorganize the IMS database in order to more closely position segments belonging to the same database record.
The current technique of reorganizing an IMS database requires that the IMS database be off-line. After the database is brought back on-line, access times may be better than before the reorganization. However, during the reorganization users have no access to the database records.
Therefore, reorganizing an IMS database in order to speed up the access times and reduce the latencies is more desirable to the user if user access to the IMS database may be maintained during the reorganization (i.e., on-line reorganization). For at least the foregoing reasons, there is a need for an improved system and method for reorganizing databases, such as IMS databases, in a more efficient manner.
The present invention provides various embodiments of an improved method and system for on-line reorganization of an IMS database while allowing concurrent updates.
In one embodiment, the method involves building and dynamically maintaining a map of free blocks in the IMS database. The user then provides a list of candidate database records to be analyzed. This list may include a keyword (i.e., “ALL”), or may identify individual database records.
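As a rough illustration only, a free block map and the candidate list might be modeled as follows. The `FreeBlockMap` class and the `resolve_candidates` function are hypothetical names, and a simple set of block numbers is an assumed representation; the text does not specify how the map is stored or maintained.

```python
class FreeBlockMap:
    """Hypothetical map of free blocks, updated as blocks are used or released."""

    def __init__(self, total_blocks, used_blocks=()):
        # Assumption: blocks are identified by integers 0..total_blocks-1.
        self._free = set(range(total_blocks)) - set(used_blocks)

    def mark_used(self, block):
        self._free.discard(block)

    def mark_free(self, block):
        self._free.add(block)

    def free_blocks(self):
        return sorted(self._free)


def resolve_candidates(requested, all_record_keys):
    """Expand the "ALL" keyword, or keep only the individually named records."""
    if len(requested) == 1 and requested[0].upper() == "ALL":
        return list(all_record_keys)
    known = set(all_record_keys)
    return [key for key in requested if key in known]


# Usage with made-up block numbers and record keys.
free_map = FreeBlockMap(total_blocks=8, used_blocks=[0, 1, 4])
free_map.mark_used(2)   # a concurrent insert takes block 2
free_map.mark_free(4)   # a concurrent delete empties block 4
candidates = resolve_candidates(["ALL"], ["R1", "R2", "R3"])
```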
The physical characteristics of each database record on the list are then analyzed. Physical Locator (PL) trace records are built for each database record on the list. The PL trace records contain physical location information for each segment of each database record on the list. The PL trace records are used to calculate a total number of physical blocks currently used to hold each database record on the list and a minimum number of physical blocks needed to hold each database record on the list. These calculations are made in order to identify fragmented database records and the segments which contain fragmented boundary twin chains. A reorganization recommendation list is then created for each database record on the list, including fragmented boundary twin chains.
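The comparison of blocks currently used against the minimum number of blocks needed can be sketched as below. The `PLTrace` field names, the 4096-byte block size, and the byte-total estimate of the minimum are all assumptions made for illustration; an actual calculation would also have to account for block overhead, data set groups, and the fact that a segment cannot span blocks.

```python
import math
from collections import namedtuple

# Hypothetical layout of one PL trace record (one per segment).
PLTrace = namedtuple("PLTrace", "record_key segment_type block_number length")

BLOCK_SIZE = 4096  # assumed usable bytes per block


def blocks_used(traces):
    """Count the distinct physical blocks currently holding the record."""
    return len({t.block_number for t in traces})


def min_blocks_needed(traces, block_size=BLOCK_SIZE):
    """Simplified lower bound: total segment bytes divided by block size."""
    total = sum(t.length for t in traces)
    return max(1, math.ceil(total / block_size))


def is_fragmented(traces, block_size=BLOCK_SIZE):
    return blocks_used(traces) > min_blocks_needed(traces, block_size)


# Made-up trace records for one database record spread over two blocks.
traces = [
    PLTrace("R1", "COURSE", 7, 500),
    PLTrace("R1", "STUDENT", 42, 500),
    PLTrace("R1", "STUDENT", 42, 500),
]
recommend = is_fragmented(traces)
```

Here the record occupies two blocks although its 1500 bytes would fit in one, so it would be placed on the reorganization recommendation list.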
Each fragmented database record on the reorganization recommendation list is then reorganized. This reorganization process includes the following steps: a) determine the number of blocks needed; b) assign and protect the number of blocks needed; c) identify the assigned blocks; d) retrieve the database record, delete the database record, and insert the database record into the identified blocks; e) commit the changes to the database.
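Steps a) through e) can be walked through with a small in-memory sketch. Everything here is an assumption for illustration: the `storage` dictionary stands in for the database blocks, the function names are hypothetical, blocks are protected simply by removing them from the free set, and the commit is represented by returning the result. Real DL/I calls, locking, and commit processing are not shown.

```python
BLOCK_SIZE = 4096  # assumed usable bytes per block


def pack_segments(segments, block_size=BLOCK_SIZE):
    """Group (segment_id, length) pairs into blocks, in the order given."""
    groups, current, used = [], [], 0
    for seg_id, length in segments:
        if current and used + length > block_size:
            groups.append(current)
            current, used = [], 0
        current.append((seg_id, length))
        used += length
    if current:
        groups.append(current)
    return groups


def reorganize_record(record_key, segments, storage, free_blocks):
    """storage maps block number -> list of (record_key, segment_id, length)."""
    # a) Determine the number of blocks needed.
    groups = pack_segments(segments)
    needed = len(groups)
    if len(free_blocks) < needed:
        raise RuntimeError("not enough free blocks")
    # b) Assign and protect the blocks so concurrent inserts cannot take them.
    assigned = sorted(free_blocks)[:needed]
    free_blocks.difference_update(assigned)
    # c) The assigned blocks are now identified by `assigned`.
    # d) Delete the record from its old blocks, then insert it into the new ones.
    for block in list(storage):
        storage[block] = [e for e in storage[block] if e[0] != record_key]
        if not storage[block]:
            del storage[block]
            free_blocks.add(block)  # an emptied block becomes free again
    for block, group in zip(assigned, groups):
        storage[block] = [(record_key, seg_id, length) for seg_id, length in group]
    # e) Commit: in this sketch, simply report the blocks now holding the record.
    return assigned


# Record R1 is scattered over blocks 3, 9 and 15; block 3 also holds part of R2.
storage = {
    3: [("R1", "A", 1000), ("R2", "X", 500)],
    9: [("R1", "B", 1000)],
    15: [("R1", "C", 1000)],
}
free_blocks = {20, 21, 22}
new_blocks = reorganize_record(
    "R1", [("A", 1000), ("B", 1000), ("C", 1000)], storage, free_blocks
)
```

After the call, all three segments of R1 sit together in one block, and the old blocks that held only R1 segments are returned to the free set.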
This process of analyzing the physical characteristics of each database record on the list and reorganizing the database records may be continued until each database record on the list is no longer fragmented. User access to the database is maintained (i.e., the database is on-line) during the analyzing and reorganizing processes.
Following the analysis of each database record individually, the IMS database as a whole may be analyzed to determine if the IMS database is disorganized from a database-record-to-database-record standpoint. If such disorganization is found, then the IMS database is subsequently reorganized, database record by database record.