IMS is a hierarchical database management system (HDBMS) developed by International Business Machines Corporation. IMS is widely used in many large enterprises where high transaction volume, reliability, availability and scalability are of the utmost importance. This high reliability and availability are achieved in part by the incorporation of logging within IMS.
Two types of system logs are found within IMS. One type of log is the IMS OLDS (Online Log Data Sets) log; the other, the SLDS (System Log Data Sets) log, is discussed infra. The OLDS log is the primary receiver of system-generated records that capture important data processing event-related information during IMS processing. Logs, and the information contained therein, are used for many different purposes. Typical examples include monitoring and/or tracing IMS transaction/database activity, creating audit trails and debugging transaction-related or database-related problems.
IMS logs may be very large, since it is quite common in IMS systems for thousands of transactions to be processed every minute of every day in a 24/7 computing environment. Individual log records may also be very large; a single logical log record may span several physical blocks within the log file. Furthermore, within a given log file, there may be substantial variation in data format, including unformatted string data comprising character data, binary data and other coded information, making meaningful translation for a human reader very difficult.
Each log record is identified by a one-byte hexadecimal field known as a “type code”. A log record may also optionally contain an additional one-byte “subtype code”. In order to properly interpret and provide meaning for a given log record, a DSECT (Dummy Section) matching the type code and subtype code for the given record must be utilized. The large number of type codes and subtype codes, in combination with large log file sizes and large record sizes, makes any type of manual analysis of log data extraordinarily tedious. The difficulty of a manual analysis may be further exacerbated in an environment where multiple logs may be relevant to the task at hand.
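The role of the type code and subtype code in selecting a record layout can be illustrated with a simple lookup table. The following is a minimal sketch only: the codes 0x01 and 0x50/0x10, the layout names, the field names and the byte offsets are hypothetical placeholders chosen for illustration, and do not correspond to actual IMS DSECTs.

```python
import struct

# Hypothetical layouts keyed by (type code, subtype code). A subtype of None
# means the record type carries no subtype. These are NOT real IMS DSECTs.
LAYOUTS = {
    (0x01, None): ("example_message", ">H", ("length",)),
    (0x50, 0x10): ("example_update", ">HI", ("length", "sequence")),
}

def describe(record: bytes) -> dict:
    """Select a layout from the type byte and, if needed, the subtype byte."""
    type_code = record[0]
    key, body_offset = (type_code, None), 1
    if key not in LAYOUTS and len(record) > 1:
        # No subtype-free layout exists, so treat the second byte as a subtype.
        key, body_offset = (type_code, record[1]), 2
    if key not in LAYOUTS:
        raise KeyError(f"no layout for type 0x{type_code:02X}")
    name, fmt, fields = LAYOUTS[key]
    values = struct.unpack_from(fmt, record, body_offset)
    return {"layout": name, **dict(zip(fields, values))}
```

Even in this toy form, every record type requires its own table entry, which suggests why interpreting a log with many type and subtype codes by hand is so laborious.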
There are a finite number of OLDS logs available online to an IMS subsystem. This log pool is defined to IMS utilizing a system parameter. Although the number of logs can be increased, it is not feasible to have a pool of online logs with sufficient capacity to accommodate logging activity over an indefinite period of time. Consideration must be given to IMS environments where one or more IMS subsystems may be running for months at a time. To accommodate extended time processing environments, IMS reuses OLDS logs, as briefly described infra.
Whenever the current log fills up, or whenever an IMS command is issued to switch logs, IMS switches to the next log in the OLDS log pool. However, before an OLDS log is reused, the contents of the log are archived to an SLDS (System Log Data Sets) log. In this way, a limited number of OLDS logs can accommodate all logging for an IMS subsystem over an extended time period. However, those skilled in the art will recognize that a practically unlimited number of log records can be generated, as there is no limitation on the number of SLDS logs that can be created as processing continues for many months and years.
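The reuse scheme described above can be modeled as a fixed ring of online logs whose contents are copied to an archive before being overwritten. The following is a toy sketch under stated assumptions: the class name LogPool, its fields, and the notion of capacity as a record count are all hypothetical and are not IMS interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class LogPool:
    """Toy model of a fixed pool of online logs archived before reuse."""
    size: int        # number of online logs in the pool (assumed fixed)
    capacity: int    # records per online log (hypothetical unit)
    online: list = field(default_factory=list)
    archived: list = field(default_factory=list)  # stands in for SLDS logs
    current: int = 0

    def __post_init__(self):
        self.online = [[] for _ in range(self.size)]

    def switch(self):
        """Advance to the next log, archiving its old contents before reuse."""
        self.current = (self.current + 1) % self.size
        if self.online[self.current]:
            self.archived.append(self.online[self.current])
            self.online[self.current] = []

    def write(self, record):
        """Append a record, switching logs first if the current one is full."""
        if len(self.online[self.current]) >= self.capacity:
            self.switch()
        self.online[self.current].append(record)
```

In this model the pool size never changes, while the archive list grows for as long as records continue to be written.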
It may be necessary to perform an analysis of log data, for reasons discussed supra, covering a particular period of time. The chosen time period may involve only SLDS logs, only OLDS logs, or a combination of both SLDS logs and OLDS logs. However, log analysis is typically performed on SLDS logs. The term “IMS log” or “log” is hereinafter used to refer to either an OLDS log or an SLDS log in those contexts where it is not necessary to distinguish between different log types.
In order for a plurality of logs to contribute to productive analysis, it is typically a requirement to have no intervening time gaps between individual log files comprising the set of log files to be analyzed for a given IMS subsystem. Although the logs are generally created at one physical location for a given IMS subsystem, it is not unusual to physically ship these logs to a different location for analysis, such as a central service center.
Database Recovery Control (DBRC) is a feature of IMS which, among other things, tracks the use of OLDS and SLDS logs. Using this tool to achieve a set of logs with no intervening time gaps may be cumbersome, time consuming and error prone in some computing environments. For example, the use of DBRC may be impractical where logs are transmitted to a service center. This is because the logs may become renamed such that they no longer match the DBRC-registered names. Furthermore, the service center may not have an installed DBRC for the logs being received even in those cases where the logs have not been renamed. Accordingly, a set of logs thought to be complete may actually have missing logs or extraneous logs. Improperly groomed logs may occur for a variety of reasons, including the use of loosely contrived naming conventions which are not clearly defined, mixing logs from multiple IMS images, inadequate research, as well as simple human error.
Once the logs are received by a service center, it may take several hours or possibly days to determine what logs have been received and to establish the proper time-order sequencing for processing. Errors detected at this stage are a source of substantial further delays. Customers are contacted with a blur of information reflecting confusion over what exactly was shipped and why it was irregular in its receipt at the service center. Frequently, substantial additional research and rework are required to ship a new set of logs.
IMS logs contain variable-length records, resulting in the log sequence number (LSN) being positioned at different physical offsets within each record. This characteristic precludes the use of standard sort utilities, which require the field controlling the sort to be at a fixed physical location within the set of records to be sorted.
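The difficulty can be illustrated with a sort whose key is computed from each record rather than read from a fixed column. The following is a minimal sketch that assumes a hypothetical framing, not the actual IMS record format: each record begins with a 2-byte big-endian length that counts the whole record, and ends with an 8-byte big-endian sequence number. The function names are likewise illustrative.

```python
import struct

def make_record(payload: bytes, seq: int) -> bytes:
    """Build a record in the assumed framing: length prefix, payload, sequence."""
    body = payload + struct.pack(">Q", seq)
    return struct.pack(">H", len(body) + 2) + body

def split_records(blob: bytes):
    """Yield each record, using its length prefix to find the next one."""
    pos = 0
    while pos < len(blob):
        (length,) = struct.unpack_from(">H", blob, pos)
        if length < 10:  # 2-byte prefix plus 8-byte sequence number
            raise ValueError(f"bad record length {length} at offset {pos}")
        yield blob[pos:pos + length]
        pos += length

def sequence_number(record: bytes) -> int:
    """Read the assumed sequence number from the last 8 bytes of the record."""
    return struct.unpack(">Q", record[-8:])[0]

def sort_by_sequence(blob: bytes) -> list:
    """Sort records by a key whose offset differs from record to record."""
    return sorted(split_records(blob), key=sequence_number)
```

Because the key sits at a different offset from the start of each record, a utility that can only compare bytes at one fixed position would read unrelated payload bytes as the sort key.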
Accordingly, there is a great need to provide a simple, quick and automated way to sort through a set of log files whereby comprehensive verification of the logs is easily accomplished as preparation for further log analysis. This verification, as discussed supra, includes verifying that a set of logs is complete, organizing the logs by IMS image, properly time-sequencing the logs, and identifying gaps in the sequence, including the specific time range unaccounted for. The above-described verification is hereinafter referred to as “log grooming”.
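The grooming steps enumerated above (organizing by IMS image, time-sequencing, and gap identification) can be sketched as follows. This is an illustrative sketch only, under stated assumptions: the LogFile tuple, its fields, the integer timestamps and the tolerance parameter are all hypothetical, and how the image identifier and timestamps are obtained from a real log is not addressed here.

```python
from collections import defaultdict
from typing import NamedTuple

class LogFile(NamedTuple):
    name: str
    image: str   # hypothetical identifier of the IMS image that wrote the log
    first: int   # timestamp of the first record (arbitrary integer units)
    last: int    # timestamp of the last record

def groom(logs, tolerance=0):
    """Group logs by image, order them by time, and report uncovered ranges."""
    by_image = defaultdict(list)
    for log in logs:
        by_image[log.image].append(log)

    report = {}
    for image, group in by_image.items():
        ordered = sorted(group, key=lambda log: log.first)
        gaps = []
        covered_until = ordered[0].last
        for log in ordered[1:]:
            # A gap exists when the next log starts later than the point
            # already covered, by more than the allowed tolerance.
            if log.first - covered_until > tolerance:
                gaps.append((covered_until, log.first))
            covered_until = max(covered_until, log.last)
        report[image] = {"order": [log.name for log in ordered], "gaps": gaps}
    return report
```

For example, three logs from one image covering the ranges 0 to 10, 10 to 15 and 20 to 30 would be reported in that order with a single gap from 15 to 20, while logs from a second image would be reported separately.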