Field of the Invention
This invention relates to systems and methods for collecting diagnostic information associated with an I/O error.
Background of the Invention
In the z/OS operating system, control blocks are used to manage the work and resources of a host system. These control blocks are represented internally as real, virtual, and/or hard storage areas and typically contain specific information pertaining to events, activity, and status occurring within the host system. In most situations, control blocks are chained to one another and can span many areas of the z/OS operating system's internal structure. Knowledge of control blocks is useful in determining vital information about the host system and its status when a failure occurs.
When an input/output (I/O) request is generated by an application running on a host system, the I/O driver builds a control block called an I/O Supervisor Block (IOSB). The IOSB describes the I/O request, passes parameters to an Input/Output Supervisor (IOS), and receives responses from it. When an I/O error occurs, information in the IOSB is often needed to identify which channel program was used to read data from or write data to a specific device. Unfortunately, by the time an SVC (Supervisor Call) dump is taken to ascertain the contents of an IOSB, the IOSB has often already been reused by another application. As a result, the data in the IOSB is often stale by the time the SVC dump is taken and is not useful for ascertaining the root cause of the I/O error.
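The staleness problem described above can be illustrated with a minimal sketch. It is a toy model in Python, not z/OS code: the names Iosb, IosbPool, and take_dump, the field names, and the sample values are all hypothetical and do not reflect the actual IOSB layout or any operating system interface. The sketch assumes only that one storage area is handed to successive I/O requests and is overwritten in place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Iosb:
    """Hypothetical stand-in for an IOSB; the fields are illustrative only."""
    owner: Optional[str] = None
    channel_program: Optional[str] = None
    device: Optional[str] = None
    error: bool = False


class IosbPool:
    """Toy pool that hands the same storage areas to successive requests."""

    def __init__(self, size: int) -> None:
        self._blocks = [Iosb() for _ in range(size)]
        self._next = 0

    def build(self, owner: str, channel_program: str, device: str) -> Iosb:
        # Pick the next block round-robin and overwrite it in place,
        # so whatever the previous request stored there is lost.
        block = self._blocks[self._next % len(self._blocks)]
        self._next += 1
        block.owner = owner
        block.channel_program = channel_program
        block.device = device
        block.error = False
        return block


def take_dump(block: Iosb) -> tuple:
    """Copy the block's current contents, as a later dump would see them."""
    return (block.owner, block.channel_program, block.device, block.error)


pool = IosbPool(size=1)

# Application A issues an I/O request that fails.
failed = pool.build("APP_A", "WRITE-CHAIN-1", "0A80")
failed.error = True
snapshot_at_error = take_dump(failed)

# Before any dump is taken, application B reuses the same block.
pool.build("APP_B", "READ-CHAIN-7", "0B10")
late_dump = take_dump(failed)

print("captured at error time:", snapshot_at_error)
print("captured by late dump: ", late_dump)
```

In this sketch, the copy made at the moment of the error still names the failing channel program and device, whereas the later copy describes only the unrelated request that reused the block.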
Diagnostic tools such as SVC dump that collect trace data are often disabled during normal operations to reduce overhead. Thus, trace data may not be collected the first time an I/O error occurs. Although a user may try to recreate the I/O error after enabling diagnostic tools, the I/O error often cannot be recreated or cannot be recreated in time to collect desired diagnostic information. This may make it difficult or impossible to determine the root cause of the I/O error.
In view of the foregoing, what are needed are systems and methods to more effectively collect diagnostic information associated with I/O errors.