1. Field of the Invention
This invention concerns a method and associated apparatus for ensuring that multi-processing systems are able to maintain unambiguous control over communications of device status information via interruptions, and over utilization of such information in respect to starting operations at associated devices.
2. Problem Solved and Principal Objects
In contemporary tightly coupled multi-processing systems, wherein plural central processing elements (hereinafter CP elements or CP's) share an operating system (supervisory programs), main storage facilities, and devices, each attempt to start an operation of a device is predicated on status information contained in a UCB (Unit Control Block) table in main storage which is uniquely associated with the device (refer to OS I/O Supervisor Logic, GY 28-6616 Pages 3-9). In such systems CP's working relative to a shared device may have interfering access to an associated UCB with potentially destructive effects.
For example, one CP may be working to start an operation at a device shared with another CP while the other CP is handling an interruption associated with the status of the same device. These CP's may be in communication with different I/O channel, subchannel and control unit paths not commonly accessible to both. Although the CP which is seeking to start the operation may be programmed to explicitly test the status of the device, via an I/O path affiliated with that CP (refer to IBM System/370 Principles of Operation, GA22-7000, pp. 208, 209 and GY 28-6616 supra, pages 17-2), and thereby recover status information manifested in that path, such tests would not enable that CP to recover status information which is manifested in a path accessible only to another CP. Moreover, the locking of the UCB by the first CP may prevent the other (interruption handling) CP from updating the UCB. Accordingly, the device operation may be started with reference to outdated status information which is not distinguishable as such. This can result in destructive error, and cause the central operating system to be burdened with wasteful "overhead" processes for error analysis and recovery.
For example, assume that while a first CP, CP-A, is handling an interruption associated with a manual change of disk packs in a DASD file, a second CP, CP-B, is working to start an output (writing) operation relative to the removed pack. Assume also that CP-B has exclusive (locked) access to the UCB associated with the DASD file. In this circumstance, the output operation might be started by CP-B relative to the wrong (newly mounted) disk pack, because CP-B is incapable of distinguishing the change in status associated with the interruption (since associated conditions which reflect this change will have been cleared from the connection path associated with CP-A when CP-A first accepted the interruption, and may not be manifested in the I/O path over which CP-B is attempting to operate). The resulting output operation could overwrite data previously recorded on the newly mounted pack, and thereby destroy valid (and possibly important) data.
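The sequence of events in the foregoing example may be illustrated by the following minimal sketch, which replays the interleaving deterministically. The classes UCB and Path, the pack names PACK1 and PACK2, and the function run_scenario are hypothetical names introduced only for this illustration; they do not correspond to any actual operating system structure or interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UCB:
    """Hypothetical stand-in for a Unit Control Block (status of one device)."""
    mounted_pack: str
    locked_by: Optional[str] = None


@dataclass
class Path:
    """Hypothetical I/O path holding status pending for the CP attached to it."""
    pending_status: Optional[str] = None


def run_scenario() -> List[str]:
    log = []
    ucb = UCB(mounted_pack="PACK1")
    path_a, path_b = Path(), Path()  # path_a reaches only CP-A, path_b only CP-B
    device_pack = "PACK1"

    # CP-B locks the UCB and plans a write based on the recorded status.
    ucb.locked_by = "CP-B"
    target = ucb.mounted_pack

    # The operator changes packs; the change is reported on CP-A's path only.
    device_pack = "PACK2"
    path_a.pending_status = "pack changed to PACK2"

    # CP-A accepts the interruption, which clears the status from its path.
    status = path_a.pending_status
    path_a.pending_status = None
    # CP-A cannot record the change because CP-B holds the UCB lock.
    if ucb.locked_by not in (None, "CP-A"):
        log.append(f"CP-A: accepted '{status}' but UCB is locked by "
                   f"{ucb.locked_by}, update deferred")

    # CP-B tests status over its own path, finds nothing, and starts the write.
    if path_b.pending_status is None:
        log.append(f"CP-B: no status on its own path, starting write "
                   f"intended for {target}")
        log.append(f"device: write lands on {device_pack}")
    return log


if __name__ == "__main__":
    for line in run_scenario():
        print(line)
```

In this sketch the write intended for PACK1 lands on PACK2 because neither the UCB nor the path available to CP-B reflects the pack change by the time the operation is started.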
A principal object of the present invention is to provide a method and associated apparatus for avoiding such CP interferences and associated destructive effects.
In contemporary multi-processing systems such interferences are avoided by adapting the shared device to communicate its changed status condition redundantly, to each sharing CP, over each path through which it can communicate with the CP's. However, this procedure, which is called multi-tagging of status, requires all CP's to redundantly process interruptions relative to a single status change event. Obviously, this is less efficient than having a single status change event processed by a single CP.
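The relative cost of multi-tagging may be illustrated by a simple count. The following sketch assumes, for illustration only, that each redundant presentation of status over a path results in one interruption being processed by the CP attached to that path; the function name and its parameters are hypothetical.

```python
def interruptions_handled(paths_per_cp: dict, events: int, multi_tag: bool) -> int:
    """Count interruptions processed for a number of status change events.

    paths_per_cp maps a CP name to the number of paths linking it to the device.
    With multi-tagging, every path presents every event (an assumption of this
    sketch); otherwise each event is processed once by a single CP.
    """
    if multi_tag:
        return events * sum(paths_per_cp.values())
    return events


if __name__ == "__main__":
    paths = {"CP-A": 2, "CP-B": 2}
    print(interruptions_handled(paths, events=10, multi_tag=True))
    print(interruptions_handled(paths, events=10, multi_tag=False))
```

Under these assumptions, two CP's with two paths each would process four interruptions per status change event with multi-tagging, as against one when a single CP handles the event.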
Therefore, another object of this invention is to provide a method and apparatus for enabling central processing elements of a multi-processing system to coordinate their handling of interruptions which relate to changes in status of a shared device, and their handling of associated status information, whereby the status communication process associated with any single status change event may be handled by a single central processing element, and yet not expose the system to potential errors in respect to the scheduling of new operations at the associated device.
It has also been proposed that I/O channels be adapted to have dynamically variable ("floating") affiliations with central processors, whereby an I/O processing subsystem shared by such channels could be delegated responsibility for assigning paths for communicating with the devices, and for managing the accounting processes associated with such assignments, and thereby relieve the central complex and its operating system of the burden of having to specify and manage such paths (refer to Clark et al's U.S. Pat. No. 3,725,864 issued Apr. 3, 1973). In such so-called "floating channel" systems the inefficiency of the "multi-tagging" technique would be compounded by the dynamic variability of channel path assignments to the multiple CP's constituting the central complex; that is, it would be difficult to ensure that each CP would be redundantly required to process interruptions relative to a single status change event.
In the above-referenced Clark et al patent, it had also been proposed to provide a single subchannel for each device in the I/O processing subsystem, apparently to provide a unique communication node for each device, relative to the central complex, regardless of the number of physical channel paths which may be able to link the device to the central complex at any time. However, such concentration of subchannel storage facilities does not ensure that operations of the associated device will always be started with reference to the most current status information. For example, even with multi-tagging of status, status could be cleared from the device to one CP over one path, followed immediately by initiating signals from another CP to the device over the same path, thus starting the device without the initiating CP being notified of the changed status.
Accordingly, the present invention seeks to provide a method and apparatus effective in such floating channel environments for enabling a CP to handle a status interruption from a shared device without exposing the system to a potentially erroneous starting operation due to concurrent operation of another CP relative to the same device.