Prior computer systems have used plural input/output processors (IOPs) having dedicated work queues (WQs) to manage CPU requests for data movement between I/O devices and computer memory. Each IOP WQ operates with a subset of I/O devices using corresponding subchannel ID numbers. Each subchannel ID number is represented internally by a queue element (QE) in Protected System Storage and is assigned to use one or more of the IOP WQs depending on the paths associated with the device the subchannel represents. The I/O devices are accessed through various physical paths, and some of the I/O devices may be accessed through more than one path. Each path to an I/O device is through an I/O channel and a control unit (CU), and may include I/O path switches. Each IOP and its associated WQ have access to a subset of paths and as a result can only service QEs with certain paths.
The IOPs manage the I/O workload requested by the CPUs in a system. Any CPU executes a start subchannel (SSCH) instruction to put a work request on an IOP WQ. Each CPU request on a WQ comprises a queue element (QE), which is a control block representing the I/O device to be started and specifying the data move information. An I/O request cannot be executed until the physical path to the requested I/O device becomes available; different channel processors control different physical paths. Thus, a request waits if any component in its physical path is busy. If a control unit (CU) of the physical path to an I/O device is not available, the request (QE) is moved from its WQ to a Control Unit Queue (CUQ), on which the QE is parked until the CU becomes available; the request is then moved back to its assigned WQ, from which it is given to the channel processor for its path, and the channel processor controls the I/O operation.
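The parking behavior described above can be illustrated with a minimal sketch. It is a toy model and not the prior-art implementation: the names QueueElement, IOPModel, start_subchannel, dispatch and cu_available are hypothetical, a single IOP is modeled, and a CU is treated as simply busy or free.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueueElement:
    """Hypothetical stand-in for a QE: one subchannel's pending request."""
    subchannel_id: int
    control_unit: str


@dataclass
class IOPModel:
    """Toy model of one IOP with its dedicated WQ and per-CU parking queues."""
    work_queue: deque = field(default_factory=deque)
    cu_queues: dict = field(default_factory=dict)  # CU name -> deque of parked QEs
    busy_cus: set = field(default_factory=set)

    def start_subchannel(self, qe):
        """Model of a CPU's SSCH: place the request on the WQ."""
        self.work_queue.append(qe)

    def dispatch(self):
        """Take the next QE; park it on its CUQ if the CU is busy, else return it."""
        if not self.work_queue:
            return None
        qe = self.work_queue.popleft()
        if qe.control_unit in self.busy_cus:
            self.cu_queues.setdefault(qe.control_unit, deque()).append(qe)
            return None
        return qe  # would be handed to the channel processor for its path

    def cu_available(self, cu):
        """The CU became free: move its parked QEs back to the assigned WQ."""
        self.busy_cus.discard(cu)
        self.work_queue.extend(self.cu_queues.pop(cu, deque()))


iop = IOPModel(busy_cus={"CU0"})
iop.start_subchannel(QueueElement(subchannel_id=1, control_unit="CU0"))
first = iop.dispatch()    # CU0 is busy, so the QE is parked and nothing starts
iop.cu_available("CU0")   # the QE returns to its assigned WQ
second = iop.dispatch()   # the QE is now released for the channel processor
```

Note that the QE always returns to the same WQ it came from; nothing in this model allows another IOP to pick it up.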
When any I/O operation is completed, the assigned IOP generates a pending I/O interruption by moving that QE to an interruption queue (IQ) assigned to that subchannel and sending an interruption signal to all CPUs in the system. The first CPU available for interruption looks at the IQs and may handle one or more pending interruptions. Each QE is assigned to use one of eight IQs associated with an interruption subclass indicated in the subchannel. The IOP subsystem redistributes work among certain of its WQs only when multiple paths have been defined in the subchannel to access a device, the path on one WQ is busy or becomes unavailable, and a path is available on another WQ. The prior IOP subsystems do not move QEs among different WQs for utilizing idle IOPs. The prior subsystems required the WQ and IOP assigned to a subchannel to handle the QE when it is a request on any WQ. A QE could not be moved to another WQ for continuing uninterrupted execution of the QE if the IOP dedicated to the assigned WQ is busy or is failing, unless the I/O operation had not begun in the channel and an alternate path also exists on another IOP. If alternate paths are not available, the operation is terminated. The IOPs could only be reconfigured as a group (when an associated part of a computer system was reconfigured) by removing that entire group from the computer system. Thus, the prior art IOP WQs and their IOPs were not used for workload balancing, recovery, or reconfiguration.
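The narrow conditions under which the prior subsystems moved a QE to another WQ can be restated as a small predicate. This is a sketch under stated assumptions: the Path type and the alternate_wq function are hypothetical names, and "busy" and "unavailable" are folded into a single available flag for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical description of one path defined in a subchannel."""
    wq: int          # index of the WQ (and its IOP) that can drive this path
    available: bool  # False if the path is busy or has become unavailable


def alternate_wq(paths, current_wq, operation_begun=False):
    """Return another WQ the QE may move to, or None if it must stay put."""
    if operation_begun:
        return None  # once the operation has begun in the channel, no move
    if any(p.wq == current_wq and p.available for p in paths):
        return None  # the path on the assigned WQ is usable, so no move
    for p in paths:
        if p.wq != current_wq and p.available:
            return p.wq  # an alternate path is available on another WQ
    return None  # no alternate path is available


moved = alternate_wq([Path(0, False), Path(1, True)], current_wq=0)
stays = alternate_wq([Path(0, True), Path(1, True)], current_wq=0)
```

The predicate takes no account of how loaded or idle the other IOPs are, which matches the observation above that QEs were not moved merely to use an idle IOP.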
In prior computer systems, if only one path is provided for a subchannel, the QE associated with that subchannel ID must wait on its assigned WQ while its assigned path is busy, or wait in a Control Unit Queue (CUQ) while its I/O control unit is busy. Because different subchannels use different CUQs, the CUQs cannot serve as a central point for balancing IOP queue work. Accordingly, in the prior art, a QE work request could only use the WQ permanently assigned to the subchannel. If a large number of Start Subchannel instructions (work requests) from one or more CPUs are directed to the subchannels assigned to a particular WQ, all of those work requests must use that WQ, even if one or more other WQs and their associated processors are empty and idle. Hence, the workload may become highly out-of-balance among the WQs in the prior IOP subsystems. Also in the prior art, if any IOP fails, a catastrophic condition exists, and the I/O requests on that IOP's WQ are put on a termination queue and terminated if alternate paths are not available or if the I/O operation had begun in the channel. Manual intervention may be required to enable new paths to allow the CPUs to bypass the failed WQ and its associated IOP, to reassign the subchannels to other operational WQs, and then to re-execute the CPU software from some prior point in an interrupted application program to repeat the CPU requests, in order to repeat the terminated I/O requests on newly assigned operational WQs.
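The imbalance described above follows directly from a fixed subchannel-to-WQ assignment, as the following sketch suggests. The number of WQs and the modulo mapping in assigned_wq are assumptions made purely for illustration; in the systems described, the assignment depends on the paths defined for each device.

```python
NUM_WQS = 4  # hypothetical number of IOP work queues


def assigned_wq(subchannel_id):
    """Hypothetical static mapping of a single-path subchannel to its WQ."""
    return subchannel_id % NUM_WQS


def queue_depths(requests):
    """Count how many pending requests land on each WQ under fixed assignment."""
    depths = {wq: 0 for wq in range(NUM_WQS)}
    for subchannel_id in requests:
        depths[assigned_wq(subchannel_id)] += 1
    return depths


# A burst of Start Subchannel requests that all target subchannels on WQ 0.
burst = [0, 4, 8, 12, 16, 20]
depths = queue_depths(burst)
print(depths)  # {0: 6, 1: 0, 2: 0, 3: 0}
```

Six requests wait on one WQ while the other three WQs and their IOPs stay empty, because nothing in a fixed assignment lets an idle IOP take over waiting work.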